Panic Mode On (78) Server Problems?


Message boards : Number crunching : Panic Mode On (78) Server Problems?

Keith White
Joined: 29 May 99
Posts: 372
Credit: 3,007,194
RAC: 2,108
United States
Message 1301759 - Posted: 3 Nov 2012, 18:16:54 UTC - in response to Message 1301757.

That's good for you msattler but the last time I got tasks actually downloaded to my cruncher was over a day ago. Sure I've got a load of new tasks assigned to me but they are "ghosts", server assigned but not actually downloaded.
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

Cherokee150
Joined: 11 Nov 99
Posts: 112
Credit: 25,659,673
RAC: 8,378
United States
Message 1301763 - Posted: 3 Nov 2012, 18:22:49 UTC

I feel I should ask my question again.

Does anyone know if the SETI staff is aware of this problem? It doesn't look like it would be easy for them to spot the trouble from their end right away. If that's true, then perhaps someone should make sure they know.

Dannis
Joined: 29 Jan 06
Posts: 24
Credit: 5,745,100
RAC: 1,520
United States
Message 1301764 - Posted: 3 Nov 2012, 18:23:43 UTC - in response to Message 1301756.

Thanks for looking. I have tried the No New Tasks option twice and they are still showing in my tasks window. I understand we have a problem getting work units. I have my preferences set to take any type of work unit. I am still not getting units. Is the work scheduler down, or are we just out of units?

sonicthe
Joined: 26 Oct 00
Posts: 2
Credit: 612,921
RAC: 0
United States
Message 1301765 - Posted: 3 Nov 2012, 18:24:43 UTC

Keith,

I've got the same indications you have. The web says I have 31 "In Progress", but none of them have actually downloaded, and I was able to clear the "Ready to Report" list by updating with NNT selected.

I still get this in my event log:
11/03/12 14:02:19 | SETI@home | Requesting new tasks for CPU
11/03/12 14:07:45 | SETI@home | Scheduler request failed: Timeout was reached
11/03/12 14:07:47 | | Project communication failed: attempting access to reference site
11/03/12 14:07:49 | | Internet access OK - project servers may be temporarily down.

This was when I clicked the Update button, but I find the same messages earlier in the log from when BOINC did it by itself.
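
For anyone who wants to see how often this shows up in their own log, here is a minimal Python sketch. It is not part of BOINC; it only assumes the event log has been copied out as text in the `date time | project | message` layout quoted above. The names `parse_line` and `summarize` are made up for illustration.

```python
from collections import Counter
from datetime import datetime

# The four lines quoted from the event log above.
LOG = """\
11/03/12 14:02:19 | SETI@home | Requesting new tasks for CPU
11/03/12 14:07:45 | SETI@home | Scheduler request failed: Timeout was reached
11/03/12 14:07:47 | | Project communication failed: attempting access to reference site
11/03/12 14:07:49 | | Internet access OK - project servers may be temporarily down.
"""


def parse_line(line):
    """Split 'timestamp | project | message' into (datetime, project, message)."""
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    return datetime.strptime(stamp, "%m/%d/%y %H:%M:%S"), project, message


def summarize(log_text):
    """Count scheduler failures and time each one from the preceding request."""
    counts = Counter()
    delays = []  # seconds between a work request and the failure that followed it
    last_request = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        when, _project, message = parse_line(line)
        if message.startswith("Requesting new tasks"):
            last_request = when
        elif message.startswith("Scheduler request failed"):
            counts["scheduler_failures"] += 1
            if last_request is not None:
                delays.append((when - last_request).total_seconds())
                last_request = None
    return counts, delays


counts, delays = summarize(LOG)
print(counts["scheduler_failures"], delays)
```

On the four quoted lines this reports one scheduler failure, 326 seconds after the request went out, which fits a request that hung until it timed out.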
____________

Keith White
Joined: 29 May 99
Posts: 372
Credit: 3,007,194
RAC: 2,108
United States
Message 1301786 - Posted: 3 Nov 2012, 19:18:46 UTC - in response to Message 1301772.

Oh, my mistake. I thought you were saying you were able to download 100K tasks, not that you went through 100K tasks.

Apologies.
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 178
Credit: 17,685,393
RAC: 16,253
Netherlands
Message 1301789 - Posted: 3 Nov 2012, 19:22:23 UTC - in response to Message 1301764.

Thanks for looking. I have tried the No New Tasks option twice and they are still showing in my tasks window. I understand we have a problem getting work units. I have my preferences set to take any type of work unit. I am still not getting units. Is the work scheduler down, or are we just out of units?


The scheduler is out, but as this looks like a more complex problem (every attempt to contact the server without "No New Tasks" set runs into a timeout), it will take more from the staff than a remote reset.

So this will probably last at least until Monday morning, Pacific time...
____________

[B^S] RicketyCat
Volunteer tester
Joined: 4 Sep 99
Posts: 13
Credit: 1,285,521
RAC: 22
United States
Message 1301790 - Posted: 3 Nov 2012, 19:22:58 UTC
Last modified: 3 Nov 2012, 19:39:57 UTC

I've aborted 6 AP task downloads, each after it repeatedly tried to download without ever getting past 1.5 KBps or past 1.6% of the total after an hour of logged download time. I've got two spinning right now doing the same thing. I'd love to crunch these things, but if they never arrive I can't. Currently one of them has reached 1.99% after 47 minutes (a new record!). If these need to be aborted as well, then I'll have to turn off AP downloads altogether until this is addressed.

I have noticed that there seems to be some relation to the transient HTTP error as each time the DL has stopped that error pops up in the log. There is no relation to reporting or uploading as I haven't crunched any type of work unit since (I think) Wednesday.
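
To put those percentages in perspective, here is a small Python sketch that extrapolates the total download time, assuming the average rate seen so far simply continues (a simplification, since the rate clearly fluctuates). The function name `projected_total_minutes` is made up for illustration.

```python
def projected_total_minutes(percent_done, minutes_elapsed):
    """Linear extrapolation: total minutes if the average rate so far holds."""
    if percent_done <= 0:
        raise ValueError("no progress yet, cannot extrapolate")
    return minutes_elapsed * 100.0 / percent_done


# Figures from the post: 1.99% after 47 minutes, 1.6% after about an hour.
for pct, mins in [(1.99, 47), (1.6, 60)]:
    total = projected_total_minutes(pct, mins)
    print(f"{pct}% in {mins} min -> about {total / 60:.1f} hours in total")
```

At those rates a single AP download would need roughly 39 to 62 hours to finish, which is why aborting looks like the only practical option.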

alan
Joined: 18 Feb 00
Posts: 131
Credit: 401,606
RAC: 0
United Kingdom
Message 1301799 - Posted: 3 Nov 2012, 19:50:06 UTC

There's something local to you causing this. I've just been given another AP unit which downloaded successfully in 10 minutes.
____________

[B^S] RicketyCat
Volunteer tester
Joined: 4 Sep 99
Posts: 13
Credit: 1,285,521
RAC: 22
United States
Message 1301803 - Posted: 3 Nov 2012, 20:05:54 UTC
Last modified: 3 Nov 2012, 20:44:16 UTC

Well, I just rebooted both the firewall and the router, and the download shot up to 39 KBps, then rapidly dwindled to 3 KBps. I'm using the hosts method, declaring .21 as the go-to server. A tracert to both .21 and .13 shows the transfer at Berkeley is choked at 208.178.58.185 (unknown ownership) and 67.16.134.26 (a global exchange server).

[edit] After another timeout I ran another set of tracerts: 64.71.140.42 didn't want to identify itself after decent sub-20 ms pings; 208.68.243.254 had 2 drops and a 59 ms ping; 208.68.240.13 showed two lines with one dropped ping each and an average of 60 ms.

[edit2] Seems to have sorted itself out, as I just got 4 AP units after aborting the two that were hanging.
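
Reading tracert output by eye gets tedious, so here is a rough Python sketch that flags hops with lost probes or slow replies. The sample text is hypothetical (documentation-range addresses, not the output from the runs described above), it assumes the three-probe layout Windows `tracert` prints, and the 50 ms threshold and the names `parse_hop` and `suspicious_hops` are arbitrary choices for illustration.

```python
# Hypothetical sample in the Windows tracert layout; NOT the poster's real output.
SAMPLE = """\
  1    <1 ms    <1 ms    <1 ms  192.0.2.1
  2    18 ms    17 ms    19 ms  198.51.100.7
  3     *        *       59 ms  203.0.113.254
  4    61 ms     *       59 ms  203.0.113.13
"""


def parse_hop(line):
    """Return (hop number, three RTTs in ms with None for a lost probe, host)."""
    fields = line.split()
    hop = int(fields[0])
    rtts = []
    i = 1
    while len(rtts) < 3:
        if fields[i] == "*":  # probe got no reply
            rtts.append(None)
            i += 1
        else:  # e.g. "18 ms" or "<1 ms"; "<1" is treated as 1 ms
            rtts.append(int(fields[i].lstrip("<")))
            i += 2  # skip the trailing "ms" token
    host = " ".join(fields[i:])
    return hop, rtts, host


def suspicious_hops(text, max_ms=50):
    """List (hop, host, lost probes, worst RTT) for hops that drop or lag."""
    flagged = []
    for line in text.splitlines():
        if not line.strip():
            continue
        hop, rtts, host = parse_hop(line)
        lost = rtts.count(None)
        answered = [r for r in rtts if r is not None]
        worst = max(answered) if answered else None
        if lost or (worst is not None and worst > max_ms):
            flagged.append((hop, host, lost, worst))
    return flagged


for hop in suspicious_hops(SAMPLE):
    print(hop)
```

One caveat: a dropped probe at an intermediate hop is not proof of congestion there, because routers often give low priority to answering traceroute probes. Loss that carries through to the final hop is the more telling sign.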

[seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7122
Credit: 61,587,322
RAC: 16,359
Germany
Message 1301811 - Posted: 3 Nov 2012, 20:58:42 UTC
Last modified: 3 Nov 2012, 20:59:51 UTC

Just for info .. - because it was asked.

I sent an email to the S@h admins Dave, Eric, Matt and Jeff on Friday, Nov 02, at 01:08 UTC.

No answer so far.

A few minutes ago I sent a second email.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
____________
BR

SETI@home Needs your Help ... $10 & U get a Star!

Team seti.international

Das Deutsche Cafe. The German Cafe.

juan BFB
Project donor
Volunteer tester
Joined: 16 Mar 07
Posts: 5486
Credit: 316,103,979
RAC: 146,210
Brazil
Message 1301815 - Posted: 3 Nov 2012, 21:06:00 UTC - in response to Message 1301811.

Just for info .. - because it was asked.

I sent an email to the S@h admins Dave, Eric, Matt and Jeff on Friday, Nov 02, at 01:08 UTC.

No answer so far.

A few minutes ago I sent a second email.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *


Please ask them to stop AP splitting until they really fix the problem, or at least do some load balancing so that the AP WUs use no more than 30% of the bandwidth. Then we could refill our caches.

____________

Filipe
Send message
Joined: 12 Aug 00
Posts: 111
Credit: 4,120,429
RAC: 235
Portugal
Message 1301816 - Posted: 3 Nov 2012, 21:16:58 UTC

Almost 10 million results out in the field??
____________

rob smith
Project donor
Volunteer tester
Joined: 7 Mar 03
Posts: 8809
Credit: 62,855,353
RAC: 74,317
United Kingdom
Message 1301821 - Posted: 3 Nov 2012, 21:35:04 UTC

Nothing to do with shorties or APs. There's a problem with the scheduler which has been going on for a few weeks, even when there are few shorties or APs around. For some reason or other the scheduler just stops responding correctly for a few hours at a time, then crawls back to life for a bit, only to go off for another nap. Very frustrating, to put it mildly.
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5942
Credit: 62,321,739
RAC: 37,089
Australia
Message 1301824 - Posted: 3 Nov 2012, 21:37:29 UTC - in response to Message 1301703.

Ok, we all know now that there is something very wrong but at least when you set to NNT you can report and empty out your cache...

Even with NNT set and only a couple of tasks to report, the Scheduler still usually times out.
Overnight I didn't get a single Scheduler response on either of my systems.
____________
Grant
Darwin NT.

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5942
Credit: 62,321,739
RAC: 37,089
Australia
Message 1301828 - Posted: 3 Nov 2012, 21:41:09 UTC - in response to Message 1301756.
Last modified: 3 Nov 2012, 21:43:02 UTC

They will be acknowledged as reported right now only if you select "No New Tasks" on the projects tab.

I wish that were true.
I'll mention it again: even with only a couple of tasks to report and No New Tasks set, the Scheduler still usually times out. There's about 1 time in 20 where it doesn't.
Uploads have been slow for much of this time as well.
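
As a back-of-the-envelope check on that "1 in 20" figure, here is a small Python sketch of how many scheduler contacts it would take to get through. It assumes every attempt succeeds independently with probability 0.05, which is a simplification: the outages described in this thread come in bursts, so real attempts are not independent. The function name `attempts_needed` is made up for illustration.

```python
import math


def attempts_needed(success_rate, target):
    """Smallest n such that 1 - (1 - success_rate)**n >= target."""
    return math.ceil(math.log(1 - target) / math.log(1 - success_rate))


# "1 time in about 20" read as a 5% chance per attempt (an assumption).
for target in (0.5, 0.9):
    n = attempts_needed(0.05, target)
    print(f"{n} attempts for a {target:.0%} chance of at least one reply")
```

Under that assumption it takes 14 attempts for even odds of one reply and 45 attempts for a 90% chance, so a whole night without a response is not surprising once the client's retry intervals are taken into account.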
____________
Grant
Darwin NT.

Spencer
Joined: 18 Mar 00
Posts: 6
Credit: 5,444,746
RAC: 11,031
United States
Message 1301831 - Posted: 3 Nov 2012, 21:57:38 UTC

sigh.... my computers are sitting out there doing absolutely nothing

fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1301843 - Posted: 3 Nov 2012, 22:40:49 UTC

Have one sitting here with 39 lost tasks and nothing to crunch. Been empty about 5 hours now. :(


Copyright © 2014 University of California