Panic Mode On (78) Server Problems?

kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1301757 - Posted: 3 Nov 2012, 18:10:01 UTC

Whoa there, kibbles........
Down about 100K tasks since the bandwidth got tied tight.

And that's with 9 kitties asking for kibble every chance they get.

Kibble bowls still full here, but begeejus.

Trouble in kibble land.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1301757 · Report as offensive
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1301759 - Posted: 3 Nov 2012, 18:16:54 UTC - in response to Message 1301757.  

That's good for you msattler but the last time I got tasks actually downloaded to my cruncher was over a day ago. Sure I've got a load of new tasks assigned to me but they are "ghosts", server assigned but not actually downloaded.
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1301759 · Report as offensive
Cherokee150

Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 1301763 - Posted: 3 Nov 2012, 18:22:49 UTC

I feel I should ask my question again.

Does anyone know if the SETI staff is aware of this problem? It doesn't look like it would be easy for them to spot the trouble from their end right away. If that's true, then perhaps someone should make sure they know.
ID: 1301763 · Report as offensive
Dannis
Joined: 29 Jan 06
Posts: 24
Credit: 11,065,028
RAC: 7
United States
Message 1301764 - Posted: 3 Nov 2012, 18:23:43 UTC - in response to Message 1301756.  

Thanks for looking. I have tried the No New Tasks option twice and they are still showing in my tasks window. I understand we have a problem getting work units. I have my preferences set to take any type of work unit, but I am still not getting units. Is the work scheduler down, or are we just out of units?
ID: 1301764 · Report as offensive
sonicthe

Joined: 26 Oct 00
Posts: 2
Credit: 612,921
RAC: 0
United States
Message 1301765 - Posted: 3 Nov 2012, 18:24:43 UTC

Keith,

I've got the same indications you have. The web says I have 31 "In Progress", but none of them have actually downloaded, and I was able to clear the "Ready to Report" list by updating with NNT selected.

I still get this in my event log:
11/03/12 14:02:19 | SETI@home | Requesting new tasks for CPU
11/03/12 14:07:45 | SETI@home | Scheduler request failed: Timeout was reached
11/03/12 14:07:47 | | Project communication failed: attempting access to reference site
11/03/12 14:07:49 | | Internet access OK - project servers may be temporarily down.

This was when I clicked the Update button, but I find the same messages earlier in the log from when BOINC did it itself.
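For anyone who wants to see how long the client waits before giving up, here is a minimal sketch that measures the gap between a work request and the scheduler failure that follows it. It assumes the event log lines have the `MM/DD/YY HH:MM:SS | project | message` shape shown above; the function names are made up for illustration and are not part of BOINC.

```python
from datetime import datetime

# The four event log lines quoted in the post above.
LOG = """\
11/03/12 14:02:19 | SETI@home | Requesting new tasks for CPU
11/03/12 14:07:45 | SETI@home | Scheduler request failed: Timeout was reached
11/03/12 14:07:47 | | Project communication failed: attempting access to reference site
11/03/12 14:07:49 | | Internet access OK - project servers may be temporarily down.
"""


def parse_line(line):
    """Split 'timestamp | project | message' into (datetime, project, message).

    Assumption: the timestamp is MM/DD/YY HH:MM:SS in local time.
    """
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    return datetime.strptime(stamp, "%m/%d/%y %H:%M:%S"), project, message


def request_timeouts(log_text):
    """Yield the seconds between each work request and the failure after it."""
    requested_at = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        when, _project, message = parse_line(line)
        if message.startswith("Requesting new tasks"):
            requested_at = when
        elif message.startswith("Scheduler request failed") and requested_at is not None:
            yield (when - requested_at).total_seconds()
            requested_at = None


print(list(request_timeouts(LOG)))
```

On the lines quoted above, the request at 14:02:19 fails at 14:07:45, so the client waited a little over five minutes before reporting the timeout.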
ID: 1301765 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1301772 - Posted: 3 Nov 2012, 18:35:21 UTC - in response to Message 1301759.  

That's good for you msattler but the last time I got tasks actually downloaded to my cruncher was over a day ago. Sure I've got a load of new tasks assigned to me but they are "ghosts", server assigned but not actually downloaded.

The kitties loaded up during the good times last week, so they are good to go for a bit.
Bad thing is......
Looks like the AP splitting and MB splitting are in lockstep right now.
Not a good thingy. At least not for us MB workers. I did apparently get about 6 AP tasks in the last 24 hours.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1301772 · Report as offensive
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1301786 - Posted: 3 Nov 2012, 19:18:46 UTC - in response to Message 1301772.  

Oh, my mistake. I thought you were saying you were able to download 100K tasks, not that you went through 100K tasks.

Apologies.
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1301786 · Report as offensive
S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 212
Credit: 43,822,095
RAC: 0
Netherlands
Message 1301789 - Posted: 3 Nov 2012, 19:22:23 UTC - in response to Message 1301764.  

Thanks for looking. I have tried the No New Tasks option twice and they are still showing in my tasks window. I understand we have a problem getting work units. I have my preferences set to take any type of work unit, but I am still not getting units. Is the work scheduler down, or are we just out of units?


The scheduler is out, but as this looks like a more complex problem (every attempt to contact the server without "No New Tasks" set runs into a timeout), it will take more from the staff than a remote reset.

So this will probably last at least until Monday morning Pacific time...
ID: 1301789 · Report as offensive
[B^S] RicketyCat
Volunteer tester
Joined: 4 Sep 99
Posts: 13
Credit: 1,326,046
RAC: 0
United States
Message 1301790 - Posted: 3 Nov 2012, 19:22:58 UTC
Last modified: 3 Nov 2012, 19:39:57 UTC

I've aborted 6 AP task downloads, each after it repeatedly tried to download without ever getting past the 1.5 KBps threshold or past 1.6% of the total after an hour of logged download time. I've got two spinning right now doing the same thing. I'd love to crunch these things, but if they never arrive I can't. Currently one of them has reached 1.99% after 47 minutes (a new record!). If these need to be aborted as well, then I'll have to turn off AP downloads altogether until this is addressed.

I have noticed that there seems to be some relation to the transient HTTP error, as each time the download has stopped that error pops up in the log. There is no relation to reporting or uploading, as I haven't crunched any type of work unit since (I think) Wednesday.
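To put the "1.99% after 47 minutes" figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the download keeps crawling at the same average rate, which a stalling transfer will not really do, so treat the result as an order of magnitude only. The function name is made up for illustration.

```python
def estimated_total_minutes(percent_done, minutes_elapsed):
    """Linear extrapolation: total time if progress continued at the same average rate."""
    if percent_done <= 0:
        raise ValueError("percent_done must be positive")
    return minutes_elapsed * 100.0 / percent_done


total = estimated_total_minutes(1.99, 47)
print(f"about {total / 60:.1f} hours for one task at this rate")
```

That works out to roughly 39 hours for a single AP task at the reported pace.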
ID: 1301790 · Report as offensive
alan
Joined: 18 Feb 00
Posts: 131
Credit: 401,606
RAC: 0
United Kingdom
Message 1301799 - Posted: 3 Nov 2012, 19:50:06 UTC

There's something local to you causing this. I've just been given another AP unit which downloaded successfully in 10 minutes.
ID: 1301799 · Report as offensive
[B^S] RicketyCat
Volunteer tester
Joined: 4 Sep 99
Posts: 13
Credit: 1,326,046
RAC: 0
United States
Message 1301803 - Posted: 3 Nov 2012, 20:05:54 UTC
Last modified: 3 Nov 2012, 20:44:16 UTC

Well, I just rebooted both the firewall and the router, and the download shot up to 39 KBps, then rapidly dwindled to 3 KBps. I'm using the hosts method and declaring .21 as the go-to server. tracert to both .21 and .13 reveals the transfer at Berkeley is choked at 208.178.58.185 (unknown ownership) and 67.16.134.26 (a global exchange server).

[edit] After another timeout I ran another set of tracerts: 64.71.140.42 didn't want to identify itself after showing decent sub-20 ms pings; 208.68.243.254 had 2 drops and a 59 ms ping; 208.68.240.13 appeared on two lines with one dropped ping each and an average of 60 ms.

[edit2] Seems to have sorted as I just got 4 AP units after aborting the two that were hanging.
ID: 1301803 · Report as offensive
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1301811 - Posted: 3 Nov 2012, 20:58:42 UTC
Last modified: 3 Nov 2012, 20:59:51 UTC

Just for info .. - because it was asked.

I sent an email to the S@h admins Dave, Eric, Matt and Jeff on Friday, Nov/02, at 01:08 UTC.

No answer so far.

A few minutes ago I sent a 2nd email.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1301811 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1301815 - Posted: 3 Nov 2012, 21:06:00 UTC - in response to Message 1301811.  

Just for info .. - because it was asked.

I sent an email to the S@h admins Dave, Eric, Matt and Jeff on Friday, Nov/02, at 01:08 UTC.

No answer so far.

A few minutes ago I sent a 2nd email.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *


Please ask them to stop AP splitting until they really fix the problem, or at least do some load balancing so that the AP WUs use no more than 30% of the bandwidth. Then we could refill our caches.

ID: 1301815 · Report as offensive
Filipe

Joined: 12 Aug 00
Posts: 218
Credit: 21,281,677
RAC: 20
Portugal
Message 1301816 - Posted: 3 Nov 2012, 21:16:58 UTC

Almost 10 million results out in the field??
ID: 1301816 · Report as offensive
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22149
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1301821 - Posted: 3 Nov 2012, 21:35:04 UTC

Nothing to do with shorties, or APs. There's a problem with the scheduler which has been going on for a few weeks, even when there are few shorties or APs around. For some reason or other the scheduler just stops responding correctly for a few hours at a time, then crawls back into life for a bit, only to go off for another nap. Very frustrating, to put it mildly.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1301821 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1301824 - Posted: 3 Nov 2012, 21:37:29 UTC - in response to Message 1301703.  

Ok, we all know now that there is something very wrong but at least when you set to NNT you can report and empty out your cache...

Even with NNT set & only a couple of tasks to report, the Scheduler still usually times out.
Overnight I didn't get a single Scheduler response on either of my systems.
Grant
Darwin NT
ID: 1301824 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1301828 - Posted: 3 Nov 2012, 21:41:09 UTC - in response to Message 1301756.  
Last modified: 3 Nov 2012, 21:43:02 UTC

They will be acknowledged as reported right now only if you select "No New Tasks" on the projects tab.

I wish that were true.
I'll mention it again: even with only a couple of tasks to report & No New Tasks set, the Scheduler still usually times out. There's 1 time in about 20 where it doesn't.
Uploads have been slow for much of this time as well.
Grant
Darwin NT
ID: 1301828 · Report as offensive
Spencer

Joined: 18 Mar 00
Posts: 6
Credit: 11,264,019
RAC: 0
United States
Message 1301831 - Posted: 3 Nov 2012, 21:57:38 UTC

sigh.... my computers are sitting out there doing absolutely nothing
ID: 1301831 · Report as offensive
fscheel

Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1301843 - Posted: 3 Nov 2012, 22:40:49 UTC

Have one sitting here with 39 lost tasks and nothing to crunch. Been empty about 5 hours now. :(
ID: 1301843 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1301858 - Posted: 3 Nov 2012, 23:21:25 UTC

Still crunching......
But many rigs are on their 2 and 3 hour comms timeouts.......not pretty.

Something other than just bandwidth saturation is at play here. And it's not playing well with others.
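For anyone wondering how a rig ends up waiting hours between attempts, here is a minimal sketch of a capped exponential backoff. This is a generic illustration, not BOINC's actual code: the 60-second base, the 4-hour cap, and the function name are all assumptions picked for the example, and a real client would normally add some randomness on top.

```python
def backoff_delay(failures, base_seconds=60, cap_seconds=4 * 3600):
    """Seconds to wait before the next retry, given the failures seen so far.

    Hypothetical constants: the delay doubles with each consecutive failure
    and is clamped at cap_seconds.
    """
    if failures < 0:
        raise ValueError("failures must be non-negative")
    return min(cap_seconds, base_seconds * 2 ** failures)


for n in range(10):
    print(n, "failures ->", round(backoff_delay(n) / 3600, 2), "hours")
```

With these made-up numbers the wait passes the two-hour mark after seven failures in a row and hits the cap after that, which is the general shape of what a run of scheduler timeouts does to retry intervals.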
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1301858 · Report as offensive


 