Panic Mode On (46) Server problems

Message boards : Number crunching : Panic Mode On (46) Server problems

-BeNt-
Joined: 17 Oct 99
Posts: 1234
Credit: 10,116,112
RAC: 0
United States
Message 1094939 - Posted: 8 Apr 2011, 23:13:18 UTC - in response to Message 1094824.  


A feature added to BOINC specifically for AP causes the estimate to be multiplied by 1.3 before comparing to the delay_bound = 25*86400 set by the ap_splitter. That multiplier allows for heavily blanked tasks when the application calculates a lot of shaped noise replacement data, which can cause run time to increase by about 30%.

As you deduced that doesn't reduce the deadline, it simply affects whether the task gets sent or not. Also, the situation is seldom so simple because there are usually other tasks already on the host.
Joe
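
For anyone who wants to play with the numbers, here is a minimal Python sketch of the check Joe describes. Only the 1.3 multiplier and delay_bound = 25*86400 come from his post; the function name would_send and the queued_work_s parameter (a crude stand-in for "other tasks already on the host") are made up for illustration, and the real scheduler logic is certainly more involved.

```python
SECONDS_PER_DAY = 86400
DELAY_BOUND = 25 * SECONDS_PER_DAY  # delay_bound quoted above for ap_splitter
AP_MARGIN = 1.3                     # multiplier quoted above for AP estimates


def would_send(estimated_runtime_s, queued_work_s=0.0):
    """Hypothetical check: does the padded estimate fit inside the delay bound?

    queued_work_s is an assumption, not something from the post: it roughly
    models work already waiting on the host ahead of the new task.
    """
    padded_estimate = AP_MARGIN * estimated_runtime_s
    return queued_work_s + padded_estimate <= DELAY_BOUND


if __name__ == "__main__":
    for days in (10, 19, 20):
        print(days, "day estimate, would send:", would_send(days * SECONDS_PER_DAY))
```

Under these assumptions a lone task stops being sent once its estimate passes roughly 25 / 1.3, or about 19.2 days. The deadline itself stays at 25 days either way.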


Interesting, thanks for the information!

Traveling through space at ~67,000mph!

ID: 1094939 · Report as offensive
Claggy
Project Donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4623
Credit: 46,349,329
RAC: 2,924
United Kingdom
Message 1095861 - Posted: 10 Apr 2011, 21:29:47 UTC
Last modified: 10 Apr 2011, 21:33:43 UTC

Uploads have dropped to zero, and downloads are dropping too; scheduler requests also fail.

Claggy

ID: 1095861 · Report as offensive
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1095865 - Posted: 10 Apr 2011, 21:55:31 UTC - in response to Message 1095861.  

Whatever it was didn't last long. I just got some new work and uploaded and reported a couple.




PROUD MEMBER OF Team Starfire World BOINC

ID: 1095865 · Report as offensive
Claggy
Project Donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4623
Credit: 46,349,329
RAC: 2,924
United Kingdom
Message 1095867 - Posted: 10 Apr 2011, 21:58:46 UTC - in response to Message 1095865.  

My uploads went through and reported too.

Claggy

ID: 1095867 · Report as offensive
Kevin Olley

Joined: 3 Aug 99
Posts: 502
Credit: 46,943,473
RAC: 13,847
United Kingdom
Message 1096722 - Posted: 13 Apr 2011, 6:16:28 UTC

Problems with the replica database, ATM it's 15,047 seconds behind the master.



Kevin


ID: 1096722 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6100
Credit: 155,249,621
RAC: 49,568
United States
Message 1096791 - Posted: 13 Apr 2011, 12:38:55 UTC - in response to Message 1096722.  
Last modified: 13 Apr 2011, 12:39:19 UTC

Problems with the replica database, ATM it's 15,047 seconds behind the master.



That is rather normal while the replica catches up after maintenance. Six hours later and it is down to about 2500 seconds.
Sometimes you might even observe the time go up for a few hours after everything comes back online. IIRC the replica recovers in about 12-24 hours after everything is brought back up.
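
As a rough back-of-the-envelope check on those figures, the sketch below works out the average catch-up rate from the two lag readings in this thread (15,047 seconds, then about 2500 seconds some six hours later). It assumes the lag shrinks linearly, which it may well not; the function names are invented for this example.

```python
def catch_up_rate(lag_start_s, lag_end_s, elapsed_hours):
    """Average reduction in replica lag, in seconds of lag per hour."""
    return (lag_start_s - lag_end_s) / elapsed_hours


def hours_to_zero(lag_now_s, rate_s_per_hour):
    """Hours until the lag reaches zero at a constant rate, or None if not shrinking."""
    if rate_s_per_hour <= 0:
        return None
    return lag_now_s / rate_s_per_hour


# Readings taken from the posts above; "six hours" is approximate.
rate = catch_up_rate(15047, 2500, 6)
remaining = hours_to_zero(2500, rate)

if __name__ == "__main__":
    print(f"catch-up rate: {rate:.0f} s of lag per hour")
    print(f"time left at that rate: {remaining:.1f} h")
```

At that pace the replica would need a little over an hour more, but since the lag can also grow for a while after everything comes back online, treat this as a best case rather than a prediction.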
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!

ID: 1096791 · Report as offensive
kittyman
Project Donor
Volunteer tester
Joined: 9 Jul 00
Posts: 45918
Credit: 815,259,469
RAC: 124,967
United States
Message 1097047 - Posted: 14 Apr 2011, 6:50:01 UTC

Oh, meow.

Downloads seem to have died.

Uppies and reporting still working.


Cats.....what more does one need?

Have made friends in this life.
Most were cats.

ID: 1097047 · Report as offensive
Robert J
Joined: 30 Mar 00
Posts: 109
Credit: 13,755,932
RAC: 8,238
United States
Message 1097058 - Posted: 14 Apr 2011, 8:13:33 UTC
Last modified: 14 Apr 2011, 8:14:23 UTC

Looks like the Cricket Graph has taken a nose dive.

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;view=Octets;ranges=d

ID: 1097058 · Report as offensive
Richard Haselgrove
Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 11141
Credit: 83,774,560
RAC: 46,206
United Kingdom
Message 1097064 - Posted: 14 Apr 2011, 9:22:48 UTC - in response to Message 1097058.  

Looks like the Cricket Graph has taken a nose dive.

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;view=Octets;ranges=d

Quoting from Lunatics:

The grapevine says 'Gowron failed a drive and is hung. It will probably be 3-4 days to re-sync the RAID array, so no work until then.'

ID: 1097064 · Report as offensive
KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 937
Credit: 20,569,067
RAC: 8,991
United Kingdom
Message 1097068 - Posted: 14 Apr 2011, 9:46:51 UTC

Time to call Dyno Rod.
They'll be there soon after the team gets into work.



ID: 1097068 · Report as offensive
-BeNt-
Joined: 17 Oct 99
Posts: 1234
Credit: 10,116,112
RAC: 0
United States
Message 1097070 - Posted: 14 Apr 2011, 10:02:51 UTC - in response to Message 1097064.  

Looks like the Cricket Graph has taken a nose dive.

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;view=Octets;ranges=d

Quoting from Lunatics:

The grapevine says 'Gowron failed a drive and is hung. It will probably be 3-4 days to re-sync the RAID array, so no work until then.'


GRRR....and just as I got my second card back in from eVGA. Hopefully the ole' cache lasts one more test!....err that should be a period....
Traveling through space at ~67,000mph!

ID: 1097070 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6100
Credit: 155,249,621
RAC: 49,568
United States
Message 1097072 - Posted: 14 Apr 2011, 10:14:59 UTC

One of my machines at home was in the middle of downloading some tasks when everything went splat about 3:30 UTC. It looks like it cleared up about 5:00 UTC, as the tasks finished downloading and a few more were requested and downloaded. Then around 7:00 UTC downloads became a no go again.


SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!

ID: 1097072 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2871
Credit: 10,621,929
RAC: 328
United States
Message 1097073 - Posted: 14 Apr 2011, 10:22:04 UTC

Yeah, I've got four APs that haven't started at all. Logs show HTTP errors starting at 0736UTC. Guess I'll suspend network comms. until it is fixed.


Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)

ID: 1097073 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6100
Credit: 155,249,621
RAC: 49,568
United States
Message 1097084 - Posted: 14 Apr 2011, 10:57:52 UTC - in response to Message 1097073.  

Yeah, I've got four APs that haven't started at all. Logs show HTTP errors starting at 0736UTC. Guess I'll suspend network comms. until it is fixed.

As upload/reporting is still working, for now :), you could set NNT to stop the inflow of work if it is going to be a few days to fix.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!

ID: 1097084 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2871
Credit: 10,621,929
RAC: 328
United States
Message 1097087 - Posted: 14 Apr 2011, 11:39:54 UTC - in response to Message 1097084.  

Yeah, I've got four APs that haven't started at all. Logs show HTTP errors starting at 0736UTC. Guess I'll suspend network comms. until it is fixed.

As upload/reporting is still working, for now :), you could set NNT to stop the inflow of work if it is going to be a few days to fix.

True, but that doesn't keep the pending downloads from retrying, filling my log up and making unnecessary connection attempts. Just easier to suspend comms. and wait for things to be fixed.
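
For reference, the two options being compared here (NNT versus suspending network comms) can also be driven from the command line. The sketch below only builds the boinccmd argument lists and prints them; it does not run anything. The option names (--project URL nomorework, --set_network_mode never/auto) are as I recall them from the boinccmd documentation, so check them against boinccmd --help on your own client. The project URL and the helper names are assumptions for illustration.

```python
import shlex

# Assumed project URL; use whatever URL your client shows for the project.
PROJECT_URL = "http://setiathome.berkeley.edu/"


def no_new_tasks_cmd(project_url=PROJECT_URL):
    """Argument list for 'No new tasks' (NNT) on one project."""
    return ["boinccmd", "--project", project_url, "nomorework"]


def suspend_network_cmd():
    """Argument list that suspends all network activity for the client."""
    return ["boinccmd", "--set_network_mode", "never"]


def resume_network_cmd():
    """Argument list that returns network activity to preference-based control."""
    return ["boinccmd", "--set_network_mode", "auto"]


if __name__ == "__main__":
    for cmd in (no_new_tasks_cmd(), suspend_network_cmd(), resume_network_cmd()):
        print(shlex.join(cmd))
```

On a host where the client is running, each list could be passed to subprocess.run. NNT only stops new work requests, while suspending the network also stops the pending downloads from retrying, which is the difference discussed above.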
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)

ID: 1097087 · Report as offensive
Richard Haselgrove
Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 11141
Credit: 83,774,560
RAC: 46,206
United Kingdom
Message 1097088 - Posted: 14 Apr 2011, 11:41:31 UTC - in response to Message 1097084.  

Yeah, I've got four APs that haven't started at all. Logs show HTTP errors starting at 0736UTC. Guess I'll suspend network comms. until it is fixed.

As upload/reporting is still working, for now :), you could set NNT to stop the inflow of work if it is going to be a few days to fix.

Any downloads you already have assigned will quieten down by 'project backoff', and keep the log reasonably clean. For the time being, uploads seem fine (except at Beta, which is stuck on uploads too, but accepting reports).

ID: 1097088 · Report as offensive
Miep
Volunteer moderator
Joined: 23 Jul 99
Posts: 2412
Credit: 351,996
RAC: 0
Message 1097092 - Posted: 14 Apr 2011, 12:04:08 UTC - in response to Message 1097088.  

Yeah, I've got four APs that haven't started at all. Logs show HTTP errors starting at 0736UTC. Guess I'll suspend network comms. until it is fixed.

As upload/reporting is still working, for now :), you could set NNT to stop the inflow of work if it is going to be a few days to fix.

Any downloads you already have assigned will quieten down by 'project backoff', and keep the log reasonably clean. For the time being, uploads seem fine (except at Beta, which is stuck on uploads too, but accepting reports).


Isn't the backoff only 2 hours max in the 6.10 branch? iirc we went up to 24h max on 6.12
Carola
-------
I'm multilingual - I can misunderstand people in several languages!

ID: 1097092 · Report as offensive
Richard Haselgrove
Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 11141
Credit: 83,774,560
RAC: 46,206
United Kingdom
Message 1097109 - Posted: 14 Apr 2011, 13:07:48 UTC - in response to Message 1097092.  

Yeah, I've got four APs that haven't started at all. Logs show HTTP errors starting at 0736UTC. Guess I'll suspend network comms. until it is fixed.

As upload/reporting is still working, for now :), you could set NNT to stop the inflow of work if it is going to be a few days to fix.

Any downloads you already have assigned will quieten down by 'project backoff', and keep the log reasonably clean. For the time being, uploads seem fine (except at Beta, which is stuck on uploads too, but accepting reports).

Isn't the backoff only 2 hours max in the 6.10 branch? iirc we went up to 24h max on 6.12

Even the basic 'per task' backoff and retry always went up to a possible limit of four hours (though randomised within that range). I don't remember 'project backoff' ever being shorter, but I'll try and find when it was introduced.
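
To illustrate what a randomised backoff with a four-hour ceiling looks like, here is a generic Python sketch. It is not BOINC's actual code: the doubling schedule, the 60-second base and the function name retry_delay are all assumptions chosen for the example. Only the four-hour limit and the "randomised within that range" behaviour come from the description above.

```python
import random

MAX_BACKOFF_S = 4 * 3600  # four-hour ceiling mentioned above


def retry_delay(failures, base_s=60.0, cap_s=MAX_BACKOFF_S, rng=random):
    """Illustrative randomised exponential backoff (not BOINC's implementation).

    The upper bound doubles with each consecutive failure until it reaches
    cap_s; the actual delay is drawn uniformly between zero and that bound.
    """
    exponent = min(failures, 20)  # keep 2**exponent small; the cap applies long before this
    ceiling = min(cap_s, base_s * (2 ** exponent))
    return rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    rng = random.Random(46)  # fixed seed so repeated runs print the same delays
    for n in range(10):
        print(f"failure {n}: retry in {retry_delay(n, rng=rng) / 60:.1f} min")
```

Randomising the delay spreads retries out, so thousands of hosts do not all hit the download servers at the same moment when they come back.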

ID: 1097109 · Report as offensive
kittyman
Project Donor
Volunteer tester
Joined: 9 Jul 00
Posts: 45918
Credit: 815,259,469
RAC: 124,967
United States
Message 1097120 - Posted: 14 Apr 2011, 14:03:02 UTC - in response to Message 1097064.  
Last modified: 14 Apr 2011, 14:36:32 UTC

Looks like the Cricket Graph has taken a nose dive.

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;view=Octets;ranges=d

Quoting from Lunatics:

The grapevine says 'Gowron failed a drive and is hung. It will probably be 3-4 days to re-sync the RAID array, so no work until then.'

Ooooohhh.
Those be sour grapes.

And I should add that I am only referring to the grapes themselves that are sour...LOL. Not any opinions about said grapes.
Meow.
Cats.....what more does one need?

Have made friends in this life.
Most were cats.

ID: 1097120 · Report as offensive
Richard Haselgrove
Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 11141
Credit: 83,774,560
RAC: 46,206
United Kingdom
Message 1097130 - Posted: 14 Apr 2011, 15:18:58 UTC - in response to Message 1097120.  

The grapes that are growing on the grapevine, you mean? Yes, they are sour indeed. But don't shoot the messenger, please - only trying to help :-)

ID: 1097130 · Report as offensive


 
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.