Panic Mode On (30) Server problems

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 976733 - Posted: 8 Mar 2010, 15:59:46 UTC - in response to Message 976716.  

I've been getting "no such task" when attempting to look at any finished WUs.

Is the DB doing a crash & burn??

Ed F

Tasks are usually kept for only about 24 hours after they have been validated. It is completely normal for a "recent" task to disappear.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 976733 · Report as offensive
EdwardPF
Volunteer tester

Joined: 26 Jul 99
Posts: 389
Credit: 236,772,605
RAC: 374
United States
Message 976738 - Posted: 8 Mar 2010, 16:29:34 UTC - in response to Message 976733.  

yes, that's true, but I can't find anything from today or yesterday ... maybe it's just my old eyes.

Ed F
ID: 976738 · Report as offensive
dnolan
Joined: 30 Aug 01
Posts: 1228
Credit: 47,779,411
RAC: 32
United States
Message 976741 - Posted: 8 Mar 2010, 16:31:21 UTC - in response to Message 976738.  

yes, that's true, but I can't find anything from today or yesterday ... maybe it's just my old eyes.

Ed F


I believe that's because the replica database is off line right now.

-Dave
ID: 976741 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 976761 - Posted: 8 Mar 2010, 17:26:38 UTC

Replica DB is now back online.....but about 41 hours behind and catching back up slowly.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 976761 · Report as offensive
EdwardPF
Volunteer tester

Joined: 26 Jul 99
Posts: 389
Credit: 236,772,605
RAC: 374
United States
Message 976770 - Posted: 8 Mar 2010, 18:42:34 UTC - in response to Message 976761.  

Yes! That would make a LOT of sense ...

Ed F
ID: 976770 · Report as offensive
ccappel
Joined: 27 Jan 00
Posts: 362
Credit: 1,516,412
RAC: 0
United States
Message 976776 - Posted: 8 Mar 2010, 19:25:44 UTC - in response to Message 976761.  

Replica DB is now back online.....but about 41 hours behind and catching back up slowly.

I'd say slowly, all right... for each 10-minute interval, the number of seconds behind drops by about 1000 (16.67 minutes). At this rate it will catch up by the time the database goes down for maintenance tomorrow.
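For anyone who wants to check that estimate, here is a minimal sketch of the arithmetic. It assumes the replica starts the full 41 hours behind (the figure from the earlier post) and keeps gaining about 1000 seconds per 10-minute interval at a constant rate; the function name is made up for illustration.

```python
def hours_to_catch_up(lag_hours, gain_seconds, interval_minutes):
    """Estimate hours until a lagging replica catches up, assuming a constant rate."""
    lag_seconds = lag_hours * 3600
    # Number of reporting intervals needed to work off the whole lag.
    intervals_needed = lag_seconds / gain_seconds
    return intervals_needed * interval_minutes / 60


# Assumed inputs: 41 hours behind, gaining ~1000 s every 10 minutes.
estimate = hours_to_catch_up(lag_hours=41, gain_seconds=1000, interval_minutes=10)
print(f"about {estimate:.1f} hours")  # about 24.6 hours
```

With those inputs it comes out at roughly a day, which fits with the replica catching up around the time of the next scheduled outage. The real rate will of course vary with server load.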
"Life is a tragedy for those who feel, and a comedy for those who think."

"I never get into an argument that I cannot win."
ID: 976776 · Report as offensive
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 976784 - Posted: 8 Mar 2010, 19:48:29 UTC - in response to Message 976776.  

well that'll solve a lot of problems won't it.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 976784 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65745
Credit: 55,293,173
RAC: 49
United States
Message 976867 - Posted: 9 Mar 2010, 1:30:55 UTC - in response to Message 976784.  

well that'll solve a lot of problems won't it.

I'm thinking we may, if we're lucky, have a short outage (I hope); otherwise, who knows how long it could last...
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 976867 · Report as offensive
Dave Mickey

Joined: 19 Oct 99
Posts: 178
Credit: 11,122,965
RAC: 0
United States
Message 976868 - Posted: 9 Mar 2010, 1:35:40 UTC

I'm in the same boat as EdwardPF, maybe with the old eyes bit... but it's pretty clear that the user WU list pages have not gotten new data for something like 24+ hours. There is *older* stuff still there, but I'm showing no uploads recorded and no new work issued since mid 3/7, and here we are entering 3/9 (UTC). Some of the missing uploads were sent just minutes before I looked (by virtue of button abuse!!). Sorry, with a new i7 online since the weekend I'm being a credit hound to see what it can do...

Anyway, hopefully the replica DB thing is what's going on here, and soon the old history will become evident to us (and boincstats, etc.).

I don't know the flow well enough to understand why the WUs are not visible in the primary (non-replica) DB, but that's OK.


Dave

ID: 976868 · Report as offensive
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 976871 - Posted: 9 Mar 2010, 1:51:26 UTC - in response to Message 976868.  

Anytime you look at a workunit or list of workunits, it is pulled from the replica to lighten the load on the primary.

When the replica is behind, it will not show work that you have recently uploaded or downloaded until it catches up to where we are now.
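A toy illustration of that behaviour (not the actual BOINC server code): it assumes the replica simply reflects the primary as it was `lag` ago, so anything newer than that does not show up on the task pages yet. The function name, timestamps, and lag value are all hypothetical.

```python
from datetime import datetime, timedelta


def visible_on_replica(event_time, now, lag):
    """True if an event recorded on the primary has reached a replica that is `lag` behind."""
    return event_time <= now - lag


now = datetime(2010, 3, 9, 1, 30)   # hypothetical "current" time (UTC)
lag = timedelta(hours=24)           # hypothetical replica lag

old_upload = datetime(2010, 3, 7, 12, 0)   # uploaded well before the lag window
new_upload = datetime(2010, 3, 9, 1, 25)   # uploaded minutes ago

print(visible_on_replica(old_upload, now, lag))  # True
print(visible_on_replica(new_upload, now, lag))  # False
```

The recent upload is safely stored on the primary; it just is not visible on the web pages until the replica has replayed up to that point.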

ID: 976871 · Report as offensive
Dave Mickey

Joined: 19 Oct 99
Posts: 178
Credit: 11,122,965
RAC: 0
United States
Message 976878 - Posted: 9 Mar 2010, 2:42:44 UTC

Ahhhh, thank you.

Had to look it up, but I drove through Golden Valley on an overnight trip to the big city of Laughlin once, about 5 years ago (from Scottsdale via Kingman). Was a bit surprised by a hail/snow storm while crossing that range between Kingman and Laughlin (and this was not winter...).

Although I have not ventured into WU, check my Davis unit here:

http://dwmickey.home.mindspring.com/weather/

We're a bit wet here recently.......

Dave

ID: 976878 · Report as offensive
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 976951 - Posted: 9 Mar 2010, 11:58:28 UTC
Last modified: 9 Mar 2010, 11:59:31 UTC

Back to the old HTTP errors. Little WUs; I expected that timewise. No, vader's down again.
ID: 976951 · Report as offensive
Wandering Willie
Volunteer tester

Joined: 19 Aug 99
Posts: 136
Credit: 2,127,073
RAC: 0
United Kingdom
Message 976955 - Posted: 9 Mar 2010, 13:18:05 UTC

Replica Database is now back online. Should be updated just in time for today's outage. Looks like another wait after.

Michael
ID: 976955 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 977038 - Posted: 10 Mar 2010, 3:43:45 UTC - in response to Message 976955.  


Hmm, something's not right since the outage. Still getting the "Scheduler request failed: Server returned nothing (no headers, no data)" messages on various attempts, but also getting "No Work Available" messages, even though there's plenty of work ready to go and not much network traffic.
Grant
Darwin NT
ID: 977038 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19060
Credit: 40,757,560
RAC: 67
United Kingdom
Message 977058 - Posted: 10 Mar 2010, 5:59:53 UTC - in response to Message 977038.  


Hmm, something's not right since the outage. Still getting the "Scheduler request failed: Server returned nothing (no headers, no data)" messages on various attempts, but also getting "No Work Available" messages, even though there's plenty of work ready to go and not much network traffic.

Take a look at the Problems thread, top sticky.
ID: 977058 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 977062 - Posted: 10 Mar 2010, 6:17:21 UTC - in response to Message 977058.  
Last modified: 10 Mar 2010, 6:18:00 UTC


Hmm, something's not right since the outage. Still getting the "Scheduler request failed: Server returned nothing (no headers, no data)" messages on various attempts, but also getting "No Work Available" messages, even though there's plenty of work ready to go and not much network traffic.

Take a look at the Problems thread, top sticky.

Yep, but it seems to have become worse since the outage.

Before the outage I might get a couple of the messages every now and then, but things were still happening with no real delay. Now those messages are the norm, and successfully contacting the scheduler and getting work is much less frequent; it's taking a lot of attempts.
And the network traffic is way down on what you'd expect after an outage, and considerably down on what it was before the outage.
Grant
Darwin NT
ID: 977062 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65745
Credit: 55,293,173
RAC: 49
United States
Message 977065 - Posted: 10 Mar 2010, 6:32:35 UTC - in response to Message 977062.  
Last modified: 10 Mar 2010, 6:33:00 UTC


Hmm, something's not right since the outage. Still getting the "Scheduler request failed: Server returned nothing (no headers, no data)" messages on various attempts, but also getting "No Work Available" messages, even though there's plenty of work ready to go and not much network traffic.

Take a look at the Problems thread, top sticky.

Yep, but it seems to have become worse since the outage.

Before the outage I might get a couple of the messages every now and then, but things were still happening with no real delay. Now those messages are the norm, and successfully contacting the scheduler and getting work is much less frequent; it's taking a lot of attempts.
And the network traffic is way down on what you'd expect after an outage, and considerably down on what it was before the outage.

I stopped and then restarted BOINC 6.10.36 (XP x64) prior to 4:30pm PST, and I've seen the "no headers, no data" message only 3 times in the Messages section of BOINC. After 7:20pm I've not seen it once, and my PC has requested and successfully downloaded new WUs. So I don't see a problem here on my PC; it may be that the server was just busy those 3 times, possibly.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 977065 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 977067 - Posted: 10 Mar 2010, 7:20:03 UTC - in response to Message 977065.  

After 7:20pm I've not seen it once, and my PC has requested and successfully downloaded new WUs. So I don't see a problem here on my PC; it may be that the server was just busy those 3 times, possibly.

Something's still not right: looking at the network traffic, the inbound is much higher than it normally is, and the outbound lower.
Grant
Darwin NT
ID: 977067 · Report as offensive
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 977071 - Posted: 10 Mar 2010, 8:25:49 UTC
Last modified: 10 Mar 2010, 8:39:39 UTC

Half of the time I have the no headers, no data answer, the other half of the time I get:

10-Mar-10 09:07:43 SETI@home Sending scheduler request: To fetch work.
10-Mar-10 09:07:43 SETI@home Requesting new tasks
10-Mar-10 09:07:43 SETI@home [sched_op_debug] CPU work request: 1.00 seconds; 0.67 idle CPUs
10-Mar-10 09:07:48 SETI@home Scheduler request completed: got 0 new tasks
10-Mar-10 09:07:48 SETI@home [sched_op_debug] Server version 611
10-Mar-10 09:07:48 SETI@home Message from server: (Project has no jobs available)
10-Mar-10 09:07:48 SETI@home Project requested delay of 11 seconds

So whichever message, I get no work. ;-)
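If you want to keep a tally rather than eyeball the Messages tab, a rough sketch like the one below can pull the interesting bits out of a log excerpt. The regular expressions are my own assumption and match only the wording shown in the lines above; other client versions may phrase things differently, and `summarize` is a made-up helper, not part of BOINC.

```python
import re

# A few lines copied from the log excerpt above.
LOG = """\
10-Mar-10 09:07:43 SETI@home Sending scheduler request: To fetch work.
10-Mar-10 09:07:48 SETI@home Scheduler request completed: got 0 new tasks
10-Mar-10 09:07:48 SETI@home Message from server: (Project has no jobs available)
10-Mar-10 09:07:48 SETI@home Project requested delay of 11 seconds
"""


def summarize(log_text):
    """Pull the task count, server message, and requested delay out of a log excerpt."""
    tasks = re.search(r"got (\d+) new tasks", log_text)
    message = re.search(r"Message from server: \((.*?)\)", log_text)
    delay = re.search(r"requested delay of (\d+) seconds", log_text)
    return {
        "new_tasks": int(tasks.group(1)) if tasks else None,
        "server_message": message.group(1) if message else None,
        "delay_seconds": int(delay.group(1)) if delay else None,
    }


print(summarize(LOG))
```

Any field that is not found in the text comes back as `None`, so the same helper can be pointed at an excerpt that ended in the "no headers, no data" failure instead.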

Now also interesting to see is that I usually get the no headers, no data on a manual update. As long as I let BOINC handle the communications, I get the no work available message.

Edit: having posted that I miraculously get work of course. ;-)
ID: 977071 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 977335 - Posted: 11 Mar 2010, 4:35:37 UTC
Last modified: 11 Mar 2010, 4:37:42 UTC

Getting none of the No headers: no data messages, but now getting lots of No work available messages.
Still getting work after it re-tries a few times.

Network traffic still doesn't look right IMHO: still much more inbound traffic than the outbound would justify, probably as a result of whatever is causing the "no headers, no data" / "no work available" problem.

Also noticed that, for whatever reason, the assimilators have been struggling. Usually you get a backlog after an outage, but it clears away quite quickly. Lately, though, backlogs have been occurring and not clearing even when there hasn't been an outage.
Grant
Darwin NT
ID: 977335 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.