Panic Mode On (80) Server Problems?


Message boards : Number crunching : Panic Mode On (80) Server Problems?

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8461
Credit: 48,818,671
RAC: 81,327
United Kingdom
Message 1332468 - Posted: 29 Jan 2013, 10:10:20 UTC - in response to Message 1332467.

But yes, there is a problem.

Yep, Scheduler borked again.
"Couldn't connect to server" once again the standard response.

The server status page froze at 08:30 UTC - once that happens, there's usually no scheduler service until the staff get to the lab and restart things.

Which, since it's Tuesday, means not until after maintenance.

And since 'ready to send' was below high water mark when the page froze, and the splitters were running, we'll probably have a big bloat of tasks to work off when things are working again.

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5790
Credit: 57,914,542
RAC: 47,893
Australia
Message 1332469 - Posted: 29 Jan 2013, 10:17:30 UTC - in response to Message 1332467.

But yes, there is a problem.

Yep, Scheduler borked again.
"Couldn't connect to server" once again the standard response.


Make that the only response.
The last few times the Scheduler was playing up, hitting retry a few hundred times would eventually report the work done & get a bit more, but not this time. Dead as a dodo.
____________
Grant
Darwin NT.

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2245
Credit: 8,586,026
RAC: 4,190
United States
Message 1332470 - Posted: 29 Jan 2013, 10:18:02 UTC

Well without a proxy, downloads are still questionable and fail often.. but I picked a proxy from the list and the 3 APs I had in my download queue were screaming in at 75-100KB/sec each.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

juan BFB
Project donor
Volunteer tester
Joined: 16 Mar 07
Posts: 5220
Credit: 283,952,516
RAC: 449,811
Brazil
Message 1332474 - Posted: 29 Jan 2013, 11:09:28 UTC

29/01/2013 09:05:19 | SETI@home | Scheduler request failed: Couldn't connect to server
29/01/2013 09:05:21 | | Internet access OK - project servers may be temporarily down.

Again? I'm tired...
____________

MikeN
Joined: 24 Jan 11
Posts: 301
Credit: 30,683,912
RAC: 37,224
United Kingdom
Message 1332497 - Posted: 29 Jan 2013, 13:26:27 UTC

Just to add insult to injury, SETI decided to declare all 180 tasks on my main cruncher 'abandoned' at 2AM this morning (UK time). After I rebooted and reset the project I have not been able to connect to SETI to get any new tasks, so it is now eating its way through Einstein and Cosmology and will probably stay that way until after the weekly outage, probably about another 8-9 hours:((
____________

Ex
Volunteer moderator
Volunteer tester
Joined: 12 Mar 12
Posts: 2895
Credit: 1,727,936
RAC: 1,132
United States
Message 1332518 - Posted: 29 Jan 2013, 15:23:53 UTC
Last modified: 29 Jan 2013, 15:24:37 UTC

I don't seem to be able to upload tasks or get any at the moment.

I know it's Tuesday AM over in Cali, but isn't it too early for the server to be down?

I guess it's good I just bumped up my caches yesterday.
____________
-Dave #2

3.2.0-33

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38869
Credit: 577,574,280
RAC: 524,323
United States
Message 1332520 - Posted: 29 Jan 2013, 15:27:27 UTC - in response to Message 1332518.

I don't seem to be able to upload tasks or get any at the moment.

I know it's Tuesday AM over in Cali, but isn't it too early for the server to be down?

I guess it's good I just bumped up my caches yesterday.

Servers crashed last night. Bookmark the Cricket graph for future reference.
Hopefully they'll be back up later today after the usual maintenance outage.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

Ex
Volunteer moderator
Volunteer tester
Joined: 12 Mar 12
Posts: 2895
Credit: 1,727,936
RAC: 1,132
United States
Message 1332522 - Posted: 29 Jan 2013, 15:33:39 UTC - in response to Message 1332520.

I don't seem to be able to upload tasks or get any at the moment.

I know it's Tuesday AM over in Cali, but isn't it too early for the server to be down?

I guess it's good I just bumped up my caches yesterday.

Servers crashed last night. Bookmark the Cricket graph for future reference.
Hopefully they'll be back up later today after the usual maintenance outage.

Thanks and thanks!
____________
-Dave #2

3.2.0-33

TPCBF
Joined: 18 May 99
Posts: 50
Credit: 989,780
RAC: 1,782
United States
Message 1332532 - Posted: 29 Jan 2013, 16:24:47 UTC - in response to Message 1331704.

Good to see I am not the only "old timer" still patiently crunching.
Altho with SETI being a bit squiffy I am afraid I have switched most of my CPU time to World Community Grid for the time being.
Now that explains why WCG went tits up yesterday as well... <LOL> (well, actually a rather sad situation)

Ralf

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38869
Credit: 577,574,280
RAC: 524,323
United States
Message 1332536 - Posted: 29 Jan 2013, 16:33:52 UTC - in response to Message 1332532.

Good to see I am not the only "old timer" still patiently crunching.
Altho with SETI being a bit squiffy I am afraid I have switched most of my CPU time to World Community Grid for the time being.
Now that explains why WCG went tits up yesterday as well... <LOL> (well, actually a rather sad situation)

Ralf

Actually, my CPUs are still all crunching Seti. It's the GPUs that quickly ran out and are now idling until the servers come back online. I sometimes run Einstein as a backup, but for now the kitties will just wait it out on standby.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8461
Credit: 48,818,671
RAC: 81,327
United Kingdom
Message 1332580 - Posted: 29 Jan 2013, 22:48:44 UTC - in response to Message 1332522.

I don't seem to be able to upload tasks or get any at the moment.

I know it's Tuesday AM over in Cali, but isn't it too early for the server to be down?

I guess it's good I just bumped up my caches yesterday.

Servers crashed last night. Bookmark the Cricket graph for future reference.
Hopefully they'll be back up later today after the usual maintenance outage.

Thanks and thanks!

Bookmark the server status page while you're at it, and pay special attention to the [As of xxx] time in the top-left corner.
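
For anyone who would rather not eyeball the [As of xxx] stamp, here is a minimal Python sketch of the same check. It assumes the stamp appears in the page text in the form "[As of 30 Jan 2013, 3:20:09 UTC]" (the form quoted elsewhere in this thread), that month names are in English, and that 30 minutes without an update counts as "frozen". The function names and the threshold are illustrative assumptions, not anything the project publishes.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed stamp format, e.g. "[As of 30 Jan 2013, 3:20:09 UTC]"
AS_OF_PATTERN = re.compile(
    r"\[As of (\d{1,2} \w{3} \d{4}, \d{1,2}:\d{2}:\d{2}) UTC\]"
)


def status_page_age(page_text, now):
    """Return how old the '[As of ...]' stamp is, or None if no stamp is found."""
    match = AS_OF_PATTERN.search(page_text)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%d %b %Y, %H:%M:%S")
    return now - stamp.replace(tzinfo=timezone.utc)


def looks_frozen(page_text, now, limit=timedelta(minutes=30)):
    """True when the stamp is older than `limit` (the limit is an assumed threshold)."""
    age = status_page_age(page_text, now)
    return age is not None and age > limit
```

Pass in the text of the status page and the current UTC time, for example `looks_frozen(page_text, datetime.now(timezone.utc))`. If it returns True, the scheduler is probably out as well, going by the pattern described above.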

And since 'ready to send' was below high water mark when the page froze, and the splitters were running, we'll probably have a big bloat of tasks to work off when things are working again.

Now -

Results ready to send:1,444,950

Rolf
Joined: 16 Jun 09
Posts: 114
Credit: 7,816,984
RAC: 151
Switzerland
Message 1332586 - Posted: 29 Jan 2013, 23:12:01 UTC - in response to Message 1332580.

Results ready to send:1,444,950

AND

30.01.2013 00:01:01 | SETI@home | Scheduler request completed: got 0 new tasks
30.01.2013 00:01:01 | SETI@home | Project has no tasks available

SORRY! But messages like this only put more confusion to the user's brain than anything else does! Or is it because of my bad English?
When I started with SETI June 16th 1999 (my first account), I thought SETI would be easy understanding and it would not need a lot of button-pushing.
No it's confusing (see above) and my retry-button is almost kaput!

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3396
Credit: 46,337,824
RAC: 9,748
Russia
Message 1332598 - Posted: 30 Jan 2013, 0:16:31 UTC - in response to Message 1332359.

Been not watching closely over the traditional Australia Day long weekend chaos, and my machines were crunching when I looked occasionally. If I had stuck transfers I just put this retryMainTransfers.cmd in my scheduled tasks for every 20 mins or so:

@ECHO OFF
boinccmd --get_file_transfers > mainxfers.txt
FOR /F "tokens=1,2" %%i IN (mainxfers.txt) DO (
  IF "%%i" EQU "name:" echo %%j
  IF "%%i" EQU "name:" boinccmd --file_transfer http://setiathome.berkeley.edu/ %%j retry
)
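
For non-Windows hosts, the same idea can be sketched in Python. This is a rough equivalent, not a tested replacement. It assumes boinccmd is on the PATH and that the transfer listing contains lines whose first token is "name:" followed by the file name, which is what the batch script above relies on. The function names are made up for illustration.

```python
import subprocess

PROJECT_URL = "http://setiathome.berkeley.edu/"


def transfer_names(listing):
    """Return the second token of every line whose first token is 'name:'."""
    names = []
    for line in listing.splitlines():
        tokens = line.split()
        if len(tokens) >= 2 and tokens[0] == "name:":
            names.append(tokens[1])
    return names


def retry_commands(listing, project_url=PROJECT_URL):
    """Build one 'boinccmd --file_transfer URL NAME retry' command per transfer."""
    return [
        ["boinccmd", "--file_transfer", project_url, name, "retry"]
        for name in transfer_names(listing)
    ]


def retry_all():
    """List the current transfers and ask the client to retry each one."""
    listing = subprocess.run(
        ["boinccmd", "--get_file_transfers"],
        capture_output=True, text=True, check=True,
    ).stdout
    for command in retry_commands(listing):
        print(command[3])  # echo the file name, as the batch script does
        subprocess.run(command, check=False)
```

Calling `retry_all()` from cron every 20 minutes or so would mirror the scheduled task described above. Parsing is kept separate from the boinccmd calls so the listing logic can be checked without a running client.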

Thanks! Will try
____________

Ex
Volunteer moderator
Volunteer tester
Joined: 12 Mar 12
Posts: 2895
Credit: 1,727,936
RAC: 1,132
United States
Message 1332636 - Posted: 30 Jan 2013, 3:06:43 UTC - in response to Message 1332586.

Results ready to send:1,444,950

AND

30.01.2013 00:01:01 | SETI@home | Scheduler request completed: got 0 new tasks
30.01.2013 00:01:01 | SETI@home | Project has no tasks available

SORRY! But messages like this only put more confusion to the user's brain than anything else does! Or is it because of my bad English?
When I started with SETI June 16th 1999 (my first account), I thought SETI would be easy understanding and it would not need a lot of button-pushing.
No it's confusing (see above) and my retry-button is almost kaput!


...Every Tuesday morning (Pacific time) we begin a four hour data distribution outage for database and systems maintenance. The upload/download servers will be offline during this time. Afterwards you may experience connectivity issues for several more hours as the servers catch up with demand. 15 Jan 2013, 17:39:39 UTC

This is nothing new...
____________
-Dave #2

3.2.0-33

Qui-Gon
Volunteer tester
Joined: 15 May 99
Posts: 2910
Credit: 6,546,821
RAC: 1,714
United States
Message 1332639 - Posted: 30 Jan 2013, 3:17:29 UTC - in response to Message 1332636.

Results ready to send:1,444,950

AND

30.01.2013 00:01:01 | SETI@home | Scheduler request completed: got 0 new tasks
30.01.2013 00:01:01 | SETI@home | Project has no tasks available

SORRY! But messages like this only put more confusion to the user's brain than anything else does! Or is it because of my bad English?
When I started with SETI June 16th 1999 (my first account), I thought SETI would be easy understanding and it would not need a lot of button-pushing.
No it's confusing (see above) and my retry-button is almost kaput!


...Every Tuesday morning (Pacific time) we begin a four hour data distribution outage for database and systems maintenance. The upload/download servers will be offline during this time. Afterwards you may experience connectivity issues for several more hours as the servers catch up with demand. 15 Jan 2013, 17:39:39 UTC

This is nothing new...

You are correct, Ex, that this is nothing new, but it is happening after I downloaded some work, and while the Server Status page shows 1.4 million "Results" ready to send. It is still confusing and misleading to send a message that there is no work available when so much is shown on the Status page. (And I've been here a while, too.)

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2245
Credit: 8,586,026
RAC: 4,190
United States
Message 1332642 - Posted: 30 Jan 2013, 3:21:23 UTC
Last modified: 30 Jan 2013, 3:31:33 UTC

I like whatever got fixed during today's maintenance (if that even happened).

For the past few hours, every scheduler contact attempt results in a reply within 3 seconds, and I've been getting a lot of "got 1 new tasks" to go with them, and the AP starts to download, and goes through completion without a single hiccup at 10-15KB/sec. This is more like it.

Oh.. maybe that's because I still have a proxy enabled. Oops. Weird though, because the last time I was using a proxy, I would get HTTP error 417 for scheduler requests, and uploads wouldn't even go through, but everything is working perfectly with this proxy. Weird.


Also.. Holy Ready to Send buffer..

[As of 30 Jan 2013, 3:20:09 UTC] Results ready to send 1,162,432 12,917 0m

____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5790
Credit: 57,914,542
RAC: 47,893
Australia
Message 1332686 - Posted: 30 Jan 2013, 7:10:10 UTC - in response to Message 1332642.


AP waiting to validate & assimilate have cleared, the Scheduler is working & the huge Ready to Send buffer is rapidly shrinking down to a more normal size.
*fingers crossed*
____________
Grant
Darwin NT.

rob smith
Project donor
Volunteer tester
Joined: 7 Mar 03
Posts: 8298
Credit: 55,072,256
RAC: 74,700
United Kingdom
Message 1332689 - Posted: 30 Jan 2013, 7:20:10 UTC - in response to Message 1332639.


You are correct, Ex, that this is nothing new, but it is happening after I downloaded some work, and while the Server Status page shows 1.4 million "Results" ready to send. It is still confusing and misleading to send a message that there is no work available when so much is shown on the Status page. (And I've been here a while, too.)


The message about "no work available" is certainly confusing, but it's a standard message. The work is sent out in small batches of 100. Once a batch has been assigned there is a short pause, and if you request work during that pause you get the message about "no work available". Confusing, but correct.
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2245
Credit: 8,586,026
RAC: 4,190
United States
Message 1332690 - Posted: 30 Jan 2013, 7:25:55 UTC
Last modified: 30 Jan 2013, 7:26:39 UTC

Actually, it is 200, and I don't know if it is still every 2 seconds or not. Used to be 100 every 2 seconds for the feeder. If it runs out before the next refill interval, you get "no work available." Mayhaps it should be updated to say "project has no work available at the moment."
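
A toy model of that behaviour, as a minimal Python sketch. It assumes a buffer of 200 tasks topped up to full every 2 seconds (the figures mentioned above, which may be out of date) and one task handed out per request. The real feeder is certainly more involved. The function and parameter names are illustrative only.

```python
def simulate_feeder(request_times, slots=200, refill_interval=2.0):
    """Replay one-task work requests against a periodically refilled buffer.

    request_times are seconds since the buffer was last filled.
    Returns a list of (time, reply) pairs.
    """
    replies = []
    available = slots
    last_refill = 0.0
    for t in sorted(request_times):
        # Top the buffer up if at least one refill tick has passed.
        if t - last_refill >= refill_interval:
            ticks = int((t - last_refill) // refill_interval)
            last_refill += ticks * refill_interval
            available = slots
        if available > 0:
            available -= 1
            replies.append((t, "got 1 new tasks"))
        else:
            # Buffer drained before the next refill: the confusing reply.
            replies.append((t, "Project has no tasks available"))
    return replies
```

With a tiny buffer the effect is easy to see: `simulate_feeder([0.1, 0.2, 0.3, 2.5], slots=2)` serves the first two requests, turns the third away with "Project has no tasks available", and serves the fourth after the refill at 2 seconds, regardless of how much work is sitting in the ready-to-send queue behind the buffer.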


Also, it looks like MB is about to run out of tapes to split. I think it would be an interesting experiment to run with AP-only for a day or two to see how Cricket looks. It will probably just stay maxed out though. MB-only just about fills it when the limits are not enforced.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5790
Credit: 57,914,542
RAC: 47,893
Australia
Message 1332692 - Posted: 30 Jan 2013, 7:33:14 UTC - in response to Message 1332690.


Probably tempting fate here, but even though downloads are as crappy as they have ever been, and it was a longer than usual outage, the Scheduler responses are coming through within 5 seconds in most cases.
____________
Grant
Darwin NT.


Copyright © 2014 University of California