Panic Mode On (28) Server problems

_heinz
Volunteer tester
Joined: 25 Feb 05
Posts: 744
Credit: 5,539,270
RAC: 0
France
Message 971123 - Posted: 18 Feb 2010, 7:52:48 UTC

hi,
the same here, can't upload any work
18.02.2010 08:51:05 SETI@home Started upload of 15fe07ac.10146.290720.11.10.4_0_0
18.02.2010 08:51:26 SETI@home Temporarily failed upload of 15fe07ac.10146.290720.11.10.4_0_0: HTTP error
18.02.2010 08:51:26 SETI@home Backing off 44 min 32 sec on upload of 15fe07ac.10146.290720.11.10.4_0_0

D5400XS V8-Xeon
ID: 971123
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 971125 - Posted: 18 Feb 2010, 7:55:55 UTC - in response to Message 971123.  
Last modified: 18 Feb 2010, 7:56:09 UTC

hi,
the same here, can't upload any work
18.02.2010 08:51:05 SETI@home Started upload of 15fe07ac.10146.290720.11.10.4_0_0
18.02.2010 08:51:26 SETI@home Temporarily failed upload of 15fe07ac.10146.290720.11.10.4_0_0: HTTP error
18.02.2010 08:51:26 SETI@home Backing off 44 min 32 sec on upload of 15fe07ac.10146.290720.11.10.4_0_0

And yet, the Cricket graphs show total bandwidth at a very lazy pace....does not make sense to me. Have never seen quite this scenario that I can remember.

These kinds of comms problems are usually seen when the bandwidth is saturated.

Maybe it will get sorted tomorrow. Until then, I got two of my prime rigs with no Cuda work. Sigh.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 971125
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 971142 - Posted: 18 Feb 2010, 9:35:53 UTC - in response to Message 971125.  

Hi, I guess I still have about a day of work for SETI, but I also can't upload about 100 tasks.
Probably a good time to clean the dust out of all the rigs and reseat a heatsink that doesn't make much contact with the CPU, as it gets too hot (96C) with only a mild (10%) OC.
BOINC hasn't even tried to upload since yesterday!

ID: 971142
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 971143 - Posted: 18 Feb 2010, 9:41:14 UTC
Last modified: 18 Feb 2010, 9:43:21 UTC

I still want answers as to just WTF is going on.......

Seti server problems, or external inet comms problems?

These have both been postulated, but not answered.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 971143
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14676
Credit: 200,643,578
RAC: 874
United Kingdom
Message 971146 - Posted: 18 Feb 2010, 9:50:17 UTC - in response to Message 971068.  

Pappa wrote:

This recovery will take a while! Go read the Tech News!

All things considered it is doing okay.

Regards

No it isn't.

The sequence was:

1) Uploads stopped (problem with HTTP service on Bruno)
2) Regular backup/compression maintenance
3) A/C failure

They've dealt with (3): (2) is history, as it always is by the following day: but reading Matt's posts, I honestly don't think he's noticed (1) yet. First day back after vacation, everything showing green on the monitors, just enough uploads trickling in to make it look like a normal/quiet day.

I'm running a very lean mix at the moment, for testing. CUDA cards switched to GPUGrid ages ago, and even big steady old Octo has finished the 2-day CPU cache. So all I'm doing at the moment is grinding through a backlog of rebranded VLARs on CPUs. But I've still got 303 uploads waiting to go through - about 20 more than when I went to bed last night.

If the uploader can't even keep up with CPU production, it's broken: I tried to say as much in Technical News last night, but it got buried in an avalanche of 15-year old UPSs. Somebody else can try using back-channels to attract attention if they want.
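The backlog argument above can be put as simple arithmetic. This is only a rough sketch: the function names are made up for illustration, and "about 20 more" is approximate, so the 283 figure for the previous night is an estimate rather than a number from the post.

```python
def backlog_change(previous_pending: int, current_pending: int) -> int:
    """Net change in pending uploads between two observations."""
    return current_pending - previous_pending


def uploader_keeping_up(previous_pending: int, current_pending: int) -> bool:
    """True when the queue did not grow, i.e. uploads matched or outpaced production."""
    return backlog_change(previous_pending, current_pending) <= 0


# Figures from the post: 303 uploads waiting now, about 20 more than last night.
current = 303
previous = current - 20  # estimate, since the post only says "about 20 more"

print(backlog_change(previous, current))
print(uploader_keeping_up(previous, current))
```

A growing queue on a host that is only producing CPU results is the point being made: production is slow, yet uploads are slower still.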
ID: 971146
Dave
Joined: 29 Mar 02
Posts: 778
Credit: 25,001,396
RAC: 0
United Kingdom
Message 971147 - Posted: 18 Feb 2010, 9:53:29 UTC

I'm out of work on this main machine for the first time since I started 6 months ago. Secondary machine has about an hour of work to go.
ID: 971147
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 971148 - Posted: 18 Feb 2010, 9:54:06 UTC
Last modified: 18 Feb 2010, 9:56:13 UTC

There is a basic comms problem going on here..........

It may be within Seti, it may be not.


All I know, is it is starting to piss me off. LOL.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 971148
PhonAcq
Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 971162 - Posted: 18 Feb 2010, 12:51:14 UTC

Monitoring my upload process, I see very few making it through at present. What is frustrating is that I see a lot that get as far as 100% uploaded, only to be rejected and queued up to try again. The last bit of handshaking fails and causes the system to repeat work (the upload) that appears to have been completed. This is not a new observation.

Because it obviously takes bandwidth and server resources to execute this type of failure, and because the behavior has been around 'forever', has any effort been made to remedy it? (Donning manager hat: If not, why not?!)
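To put a rough number on that cost, here is a small sketch. It assumes the simplest case described above, where every rejected attempt has already sent the whole file before the final handshake fails. The function names, the 30,000-byte file size and the count of 3 failed attempts are hypothetical, picked only for illustration, not measured values.

```python
def bytes_sent(file_size_bytes: int, failed_attempts: int) -> int:
    """Total bytes transferred when each failed attempt sends the whole file
    before being rejected, followed by one successful attempt."""
    if file_size_bytes < 0 or failed_attempts < 0:
        raise ValueError("file size and attempt count must be non-negative")
    return file_size_bytes * (failed_attempts + 1)


def wasted_fraction(failed_attempts: int) -> float:
    """Share of the transferred bytes that was spent on rejected attempts."""
    if failed_attempts < 0:
        raise ValueError("attempt count must be non-negative")
    return failed_attempts / (failed_attempts + 1)


# Hypothetical example: a 30,000-byte result file rejected 3 times, then accepted.
print(bytes_sent(30_000, 3))
print(wasted_fraction(3))
```

Under that assumption, three rejected attempts mean three quarters of the upload traffic for that file did no useful work.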
ID: 971162
the silver surfer
Joined: 24 Feb 01
Posts: 131
Credit: 3,739,307
RAC: 0
Austria
Message 971164 - Posted: 18 Feb 2010, 12:57:40 UTC - in response to Message 971142.  

Hi, I guess I still have about a day of work for SETI, but I also can't upload about 100 tasks.
Probably a good time to clean the dust out of all the rigs and reseat a heatsink that doesn't make much contact with the CPU, as it gets too hot (96C) with only a mild (10%) OC.
BOINC hasn't even tried to upload since yesterday!



Hi Fred,

To be correct, it isn't BOINC, it's SETI. I'm on some other projects right now,
and uploads/downloads are doing just fine.

All the best !

Kurt

ID: 971164
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 971165 - Posted: 18 Feb 2010, 13:00:14 UTC - in response to Message 971162.  
Last modified: 18 Feb 2010, 13:33:09 UTC

Monitoring my upload process, I see very few making it through at present. What is frustrating is that I see a lot that get as far as 100% uploaded, only to be rejected and queued up to try again. The last bit of handshaking fails and causes the system to repeat work (the upload) that appears to have been completed. This is not a new observation.

Because it obviously takes bandwidth and server resources to execute this type of failure, and because the behavior has been around 'forever', has any effort been made to remedy it? (Donning manager hat: If not, why not?!)

All comms are wrecked here for now too.......
[rant]This is rather frustrating, as no acknowledgment has been made of the problem as of yet.......so I don't know WTF is going on. Nothing is happening on Cricket, nor the server status page to give a clue...
WHERE ARE THE FISH?????[/rant]
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 971165
Bill Walker
Joined: 4 Sep 99
Posts: 3868
Credit: 2,697,267
RAC: 0
Canada
Message 971171 - Posted: 18 Feb 2010, 13:29:43 UTC

I'm actually out of SETI work, for the first time in months.

Uploaded 5 out of 8 completed WUs overnight; the oldest of these had been trying to transfer for about 48 hours. No downloads, messages like this:

18/02/2010 8:26:04 AM|SETI@home|Sending scheduler request: Requested by user. Requesting 259200 seconds of work, reporting 1 completed tasks
18/02/2010 8:26:26 AM||Project communication failed: attempting access to reference site
18/02/2010 8:26:27 AM||Internet access OK - project servers may be temporarily down.
18/02/2010 8:26:29 AM|SETI@home|Scheduler request failed: Couldn't connect to server

Cricket graphs appear virtually dead. Definitely a communications problem somewhere....

Fortunately all my other projects are running nicely.

ID: 971171
ccappel
Joined: 27 Jan 00
Posts: 362
Credit: 1,516,412
RAC: 0
United States
Message 971176 - Posted: 18 Feb 2010, 13:40:28 UTC

After many many retry attempts (automatic, not manual), I finally have all my uploads completed and tasks reported. Haven't downloaded anything for a while, but thank god for the 10 day cache. Considering myself lucky.
"Life is a tragedy for those who feel, and a comedy for those who think."

"I never get into an argument that I cannot win."
ID: 971176
Spectrum
Joined: 14 Jun 99
Posts: 468
Credit: 53,129,336
RAC: 0
Australia
Message 971178 - Posted: 18 Feb 2010, 13:47:47 UTC - in response to Message 971165.  

Same here in the land of Oz: 1 or 2 WUs upload, but no luck updating. Calm down and go roll in some catnip lol
ID: 971178
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 971180 - Posted: 18 Feb 2010, 13:53:21 UTC - in response to Message 971178.  
Last modified: 18 Feb 2010, 13:58:11 UTC

Same here in the land of Oz: 1 or 2 WUs upload, but no luck updating. Calm down and go roll in some catnip lol

Catnip, hell......I wanna CRUNCH something.... LOL.
But uploading and reporting would do for now.

What I wanna know is just where the problem is....and is it gonna be addressed soon?
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 971180
Labriskus
Joined: 20 Feb 07
Posts: 11
Credit: 745,920
RAC: 0
United Kingdom
Message 971183 - Posted: 18 Feb 2010, 14:00:28 UTC
Last modified: 18 Feb 2010, 14:01:42 UTC

Erhm, not sure if anyone noticed the news page:

Projects are down due to a server closet air conditioning failure.
We have to power down most of our computers until this is fixed. 17 Feb 2010 2:36:55 UTC


http://setiathome.berkeley.edu/index.php


Apologies if this has already been pointed out, but this could be what's going on :)
ID: 971183
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 971184 - Posted: 18 Feb 2010, 14:04:20 UTC - in response to Message 971183.  

Erhm, not sure if anyone noticed the news page:

Projects are down due to a server closet air conditioning failure.
We have to power down most of our computers until this is fixed. 17 Feb 2010 2:36:55 UTC


http://setiathome.berkeley.edu/index.php


Apologies if this has already been pointed out, but this could be what's going on :)

My friend, if the AC had not been fixed, we would not be talking right now.......the servers would still be down.

There is a comms problem that existed before the AC failure, and still persists.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 971184
Labriskus
Joined: 20 Feb 07
Posts: 11
Credit: 745,920
RAC: 0
United Kingdom
Message 971185 - Posted: 18 Feb 2010, 14:11:42 UTC - in response to Message 971184.  

Erhm, not sure if anyone noticed the news page:

Projects are down due to a server closet air conditioning failure.
We have to power down most of our computers until this is fixed. 17 Feb 2010 2:36:55 UTC


http://setiathome.berkeley.edu/index.php


Apologies if this has already been pointed out, but this could be what's going on :)

My friend, if the AC had not been fixed, we would not be talking right now.......the servers would still be down.

There is a comms problem that existed before the AC failure, and still persists.



I see ........ then I stand corrected :)

Still, it's a nice chance to give the PC a clean :)
ID: 971185
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 971186 - Posted: 18 Feb 2010, 14:13:16 UTC - in response to Message 971185.  

Erhm, not sure if anyone noticed the news page:

Projects are down due to a server closet air conditioning failure.
We have to power down most of our computers until this is fixed. 17 Feb 2010 2:36:55 UTC


http://setiathome.berkeley.edu/index.php


Apologies if this has already been pointed out, but this could be what's going on :)

My friend, if the AC had not been fixed, we would not be talking right now.......the servers would still be down.

There is a comms problem that existed before the AC failure, and still persists.



I see ........ then I stand corrected :)

Still, it's a nice chance to give the PC a clean :)

I suspect many dust bunnies are meeting their maker about now.

"Time is simply the mechanism that keeps everything from happening all at once."

ID: 971186
jangliss
Joined: 14 Jan 04
Posts: 1
Credit: 7,544
RAC: 0
United States
Message 971189 - Posted: 18 Feb 2010, 14:14:55 UTC - in response to Message 971184.  

My friend, if the AC had not been fixed, we would not be talking right now.......the servers would still be down.


You'd think they'd update the front page to reflect that the A/C is back up again.


There is a comms problem that existed before the AC failure, and still persists.


Weird that half the servers are still reporting "OK", and yet I've seen numerous complaints about not being able to report updates. All I keep getting is "retry in {some time}". I thought it might have been my connection, but I tested from 3 different locations and still get it. Sniffing the traffic, the server is throwing a 500 error when it reports the data. I'd be more than happy to post my traffic if it helps anybody resolve the issues.
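For anyone wondering why a 500 points at the server rather than at the connection, here is a minimal sketch, assuming the "500 error" is the HTTP status code 500. The helper name `classify_status` is made up for illustration; `http.HTTPStatus` is from the Python standard library.

```python
from http import HTTPStatus


def classify_status(code: int) -> str:
    """Map an HTTP status code to the broad class defined by its first digit."""
    if 200 <= code <= 299:
        return "success"
    if 400 <= code <= 499:
        return "client error"
    if 500 <= code <= 599:
        return "server error"
    return "other"


# The status seen in the sniffed traffic, per the post above.
print(HTTPStatus(500).phrase)
print(classify_status(500))
```

A 5xx response means the request reached the server and the server failed to handle it, which fits with getting the same result from 3 different locations.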
ID: 971189
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 971190 - Posted: 18 Feb 2010, 14:23:17 UTC - in response to Message 971189.  

My friend, if the AC had not been fixed, we would not be talking right now.......the servers would still be down.


You'd think they'd update the front page to reflect that the A/C is back up again.


There is a comms problem that existed before the AC failure, and still persists.


Weird that half the servers are still reporting "OK", and yet I've seen numerous complaints about not being able to report updates. All I keep getting is "retry in {some time}". I thought it might have been my connection, but I tested from 3 different locations and still get it. Sniffing the traffic, the server is throwing a 500 error when it reports the data. I'd be more than happy to post my traffic if it helps anybody resolve the issues.

That's the whole durned point.....
I have not had anybody confirm or deny whether this problem is within the Seti servers or is with their connection to the outside world.

I am gonna go to sleep right now, should I not freeze because the Cuda cards aren't throwing off much heat at the moment, and hope when I wake up it was all just a bad kitty dream.

WHERE ARE THE FISH???

"Time is simply the mechanism that keeps everything from happening all at once."

ID: 971190


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.