Panic Mode On (14) Server problems

Message boards : Number crunching : Panic Mode On (14) Server problems
Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 143
Credit: 6,652,341
RAC: 0
Canada
Message 878473 - Posted: 23 Mar 2009, 3:16:35 UTC

3/22/2009 7:47:11 PM|SETI@home|Message from server: Completed result 03ja09ad.1633.481.16.8.219_0 refused: result already reported as success


Not seen that message before when reporting
ID: 878473 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 878476 - Posted: 23 Mar 2009, 3:29:20 UTC - in response to Message 878473.  

3/22/2009 7:47:11 PM|SETI@home|Message from server: Completed result 03ja09ad.1633.481.16.8.219_0 refused: result already reported as success


Not seen that message before when reporting

That's nothing, really. Under heavy network load, your client reports a finished task and the server records it, but the server's acknowledgment (ACK) doesn't make it back to you. Your client then tries to report the task again, and the server refuses the second report because it already has the result.
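
A minimal sketch of that sequence, in Python. The class and function names here are made up for illustration and this is not the actual BOINC server code; it only shows why a lost ACK ends in a harmless "already reported" refusal:

```python
# Hypothetical sketch: names are illustrative, not taken from BOINC.
class ReportServer:
    def __init__(self):
        self.reported = set()  # names of results already accepted

    def report(self, result_name):
        """Accept a result once; refuse any repeat of the same result."""
        if result_name in self.reported:
            return (f"Completed result {result_name} refused: "
                    "result already reported as success")
        self.reported.add(result_name)
        return "ack"


def client_report(server, result_name, ack_lost_on_first_try):
    """Report a result and return the reply the client finally sees."""
    reply = server.report(result_name)
    if ack_lost_on_first_try:
        # The server stored the result, but the ACK never reached the client,
        # so the client assumes the report failed and sends it again.
        reply = server.report(result_name)
    return reply
```

Either way the result is stored exactly once, so no work or credit is lost.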
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 878476 · Report as offensive
Neil Blaikie
Volunteer tester
Joined: 17 May 99
Posts: 143
Credit: 6,652,341
RAC: 0
Canada
Message 878483 - Posted: 23 Mar 2009, 3:56:42 UTC - in response to Message 878476.  

Thought it might be something simple.

Thanks for the response.
ID: 878483 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65971
Credit: 55,293,173
RAC: 49
United States
Message 878493 - Posted: 23 Mar 2009, 4:22:21 UTC
Last modified: 23 Mar 2009, 4:24:55 UTC

I'm getting this on downloads right now.

3/22/2009 9:20:28 PM SETI@home [error] File 11ja09aa.17799.14387.11.8.156 has wrong size: expected 375339, got 0
3/22/2009 9:20:28 PM SETI@home Started download of 11ja09aa.17799.14387.11.8.156

Update: Something cleared, so that isn't an issue for me now.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 878493 · Report as offensive
Vipin Palazhi
Joined: 29 Feb 08
Posts: 286
Credit: 167,386,578
RAC: 0
India
Message 878529 - Posted: 23 Mar 2009, 7:51:20 UTC
Last modified: 23 Mar 2009, 7:53:23 UTC

The downloads are fine, but I noticed that none of the four MB validators are running. My pending credits have shot up by 10k since yesterday.

ID: 878529 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 878628 - Posted: 23 Mar 2009, 18:02:37 UTC - in response to Message 878529.  

Indeed -- the validators all went offline sometime on Sunday and are still offline as of 11AM PDT today.

The downloads are fine, but I noticed that none of the four MB validators are running. My pending credits have shot up by 10k since yesterday.



ID: 878628 · Report as offensive
-Bert-

Joined: 23 Mar 02
Posts: 152
Credit: 412,754
RAC: 0
Netherlands
Message 878675 - Posted: 23 Mar 2009, 20:30:11 UTC - in response to Message 878628.  

They're back online now and crediting is working again.
ID: 878675 · Report as offensive
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 878951 - Posted: 24 Mar 2009, 22:52:25 UTC


Oops.. an unplanned outage after the weekly outage?



ID: 878951 · Report as offensive
Blurf
Volunteer tester
Joined: 2 Sep 06
Posts: 8962
Credit: 12,678,685
RAC: 0
United States
Message 878961 - Posted: 24 Mar 2009, 23:06:58 UTC

Might want to read THIS.


ID: 878961 · Report as offensive
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 881366 - Posted: 1 Apr 2009, 18:51:34 UTC
Last modified: 1 Apr 2009, 18:54:40 UTC


Woohoo.. the train is on a straight track? ;-D




BTW, does anyone know when we could get new WUs?

ID: 881366 · Report as offensive
DPRGI - Luivul
Joined: 24 Jan 03
Posts: 17
Credit: 20,639,801
RAC: 0
Italy
Message 881654 - Posted: 2 Apr 2009, 13:29:22 UTC

I got some WUs but can't upload. I looked at the cricket graph and see abnormal traffic :( . What happened?
ID: 881654 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 881679 - Posted: 2 Apr 2009, 14:57:11 UTC - in response to Message 881654.  

I got some WUs but can't upload. I looked at the cricket graph and see abnormal traffic :( . What happened?

The flood gates opened. Matt turned all the MB splitters on yesterday, so there is much more work available to download, and thousands of clients world-wide ended up draining their cache, and are trying to refill it.

Once the bandwidth is no longer maxed out (as shown by the cricket graph, or that picture above..which is static), uploads should flow just fine.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 881679 · Report as offensive
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 881737 - Posted: 2 Apr 2009, 18:08:21 UTC


No, the graph I posted isn't static.. it's 'real time, live' or whatever.. ;-)

ID: 881737 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 881752 - Posted: 2 Apr 2009, 18:35:36 UTC - in response to Message 881737.  
Last modified: 2 Apr 2009, 18:37:10 UTC


No, the graph I posted isn't static.. it's 'real time, live' or whatever.. ;-)

Alright. I'm used to static pictures being posted. Packets/sec is a strange way of looking at it though. I have octets (bytes)/sec bookmarked (link here).

Oh, and my official submission for panic at the moment is.. I have 10 WUs that won't upload! One of them has an hour and a half of upload time on it. I know there's the backoff mechanism that they run through, but every 20-30 minutes I'll pick one and tell it to retry now, and that just doesn't work.

Oh well, I tried.
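
For anyone wondering what a backoff mechanism like that looks like, here is a rough sketch in Python of exponential backoff with random jitter. The base delay, the cap, and the function name are assumptions picked for illustration; they are not BOINC's real settings or code:

```python
import random


def backoff_delays(attempts, base=60.0, cap=4 * 3600.0, seed=None):
    """Return one retry delay (in seconds) per failed attempt.

    base and cap are assumed values, not the client's actual ones.
    """
    rng = random.Random(seed)
    delays = []
    for n in range(attempts):
        # The upper bound doubles after every failure until it hits the cap.
        ceiling = min(cap, base * 2 ** n)
        # Random jitter keeps thousands of clients from retrying in lockstep.
        delays.append(rng.uniform(ceiling / 2, ceiling))
    return delays


if __name__ == "__main__":
    for n, delay in enumerate(backoff_delays(8, seed=1), start=1):
        print(f"failure {n}: wait about {delay / 60:.1f} min")
```

Hitting "retry now" only skips the wait. It doesn't help while the link itself is saturated, which would fit what I'm seeing.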
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 881752 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 881774 - Posted: 2 Apr 2009, 19:53:22 UTC

Alright, [/panic]. They all uploaded at the same time (select all, retry now). It's too early to tell, but it looks like the bytes/sec rate has come down from the ceiling of ~93 Mbit/s to ~90 Mbit/s. That 3 Mbit/s drop in the project's outbound traffic matches a 3 Mbit/s rise in inbound.
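
Since the graph is in octets (bytes)/sec and I keep quoting megabits, here is the conversion as a small Python sketch. It assumes 1 Mbit means 10**6 bits, and the function names are just mine:

```python
def octets_per_sec_to_mbit(octets_per_sec):
    """Convert octets (bytes) per second to megabits per second."""
    return octets_per_sec * 8 / 1_000_000


def mbit_to_octets_per_sec(mbit_per_sec):
    """Convert megabits per second to octets (bytes) per second."""
    return mbit_per_sec * 1_000_000 / 8


if __name__ == "__main__":
    for rate in (93, 90, 3):
        print(f"{rate} Mbit/s = {mbit_to_octets_per_sec(rate):,.0f} octets/s")
```

So the ~93 Mbit/s ceiling is about 11.6 million octets/sec on the graph, and a 3 Mbit/s change is about 375,000 octets/sec.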
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 881774 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 882096 - Posted: 3 Apr 2009, 21:56:46 UTC

Well, it appears (at 21:55 UTC) that the database issues must be done. All four MB validators have been turned on, and I think the "waiting for assimilation" queue is getting smaller. There are several hundred thousand results/WUs in the "waiting for deletion" queue.

And it must be doing something, because the splitters are creating at ~25/sec and there is an RTS queue. Of course the server status page isn't a "real time" picture, nor very accurate, but it's a general picture at least.

Yay!
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 882096 · Report as offensive
Niteryder
Volunteer tester

Joined: 1 Mar 99
Posts: 64
Credit: 22,663,988
RAC: 18
United States
Message 882294 - Posted: 4 Apr 2009, 16:47:36 UTC

Something has gone wrong. The graph has bottomed out and web pages are slow to load. Also, I cannot get any MB jobs.
ID: 882294 · Report as offensive
Andy Williams
Volunteer tester
Joined: 11 May 01
Posts: 187
Credit: 112,464,820
RAC: 0
United States
Message 882296 - Posted: 4 Apr 2009, 16:51:12 UTC - in response to Message 882294.  

It's the weekend...
--
Classic 82353 WU / 400979 h
ID: 882296 · Report as offensive
KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 944
Credit: 52,956,491
RAC: 67
United Kingdom
Message 882318 - Posted: 4 Apr 2009, 18:08:46 UTC - in response to Message 882296.  

Good cache, nice cache.......

ID: 882318 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 882602 - Posted: 5 Apr 2009, 20:54:02 UTC

Looks like the assimilators are crunching away. "waiting for assimilation" is down under 1M, and the "waiting for db purge" queue is into the 7-digit range. Now just waiting for those purge queues to be processed and that should free up a TON of storage space and allow the splitters to provide a stable amount of work.

Looks like the storm is over. :D
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 882602 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.