Panic Mode On (87) Server Problems?

Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1483114 - Posted: 1 Mar 2014, 7:11:38 UTC

Your other machine is fine with getting work. So I'd say you have issues with just that machine.

Old James
ID: 1483114 · Report as offensive
Profile Robert J
Joined: 30 Mar 00
Posts: 115
Credit: 20,087,874
RAC: 15
United States
Message 1483117 - Posted: 1 Mar 2014, 7:23:11 UTC - in response to Message 1483114.  

James,

Thanks for the reply.

I shut Boinc down and restarted, now all is good. And downloads are now successful.

No idea on how the machine got its eyes crossed.

The term "go figure" comes to mind.

Have a great weekend!

Cheers!

Robert
ID: 1483117 · Report as offensive
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1483119 - Posted: 1 Mar 2014, 7:26:25 UTC

My Vista machine every once in a while won't connect to the internet, so I do a reboot when that happens. Must be the electrons getting stuck somewhere in the Ethernet card :)
Glad you solved it.

Old James
ID: 1483119 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1483120 - Posted: 1 Mar 2014, 7:31:48 UTC - in response to Message 1483117.  

James,

Thanks for the reply.

I shut Boinc down and restarted, now all is good. And downloads are now successful.

No idea on how the machine got its eyes crossed.

The term "go figure" comes to mind.

Have a great weekend!

Cheers!

Robert
When we have a server issue, like the recent upload server problems, it does seem to trigger a memory leak in BOINC, which can then lead to weird issues.
Often when I see issues on my machines, boinc.exe is using more than 50 MB of memory. Sometimes it is a client that doesn't want to do transfers anymore, or I just can no longer connect remotely to the client. So I just give the machine a reboot and then it goes back to being its normal happy self.
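
For anyone who would rather script that memory check than open Task Manager, here is a rough Python sketch. It only parses text that has already been captured, and it assumes the row layout that Windows `tasklist /FO CSV /NH` prints (image name, PID, session name, session number, memory usage in K). The function name `mem_usage_kb` and the sample row are made up for illustration, and the 50 MB threshold is just the rule of thumb mentioned above, not an official BOINC limit.

```python
import csv
import io

# Rule-of-thumb threshold from the post above (assumption, not a BOINC limit).
THRESHOLD_KB = 50 * 1024


def mem_usage_kb(tasklist_csv, image_name="boinc.exe"):
    """Return memory usage in KB for each matching row of tasklist CSV text."""
    usages = []
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Expected columns: image name, PID, session name, session#, mem usage.
        if len(row) < 5 or row[0].lower() != image_name.lower():
            continue
        # Mem usage looks like "61,440 K"; keep only the digits.
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        if digits:
            usages.append(int(digits))
    return usages


# Hypothetical sample row, not taken from a real machine.
sample = '"boinc.exe","1234","Console","1","61,440 K"\n'
suspect = [kb for kb in mem_usage_kb(sample) if kb > THRESHOLD_KB]
print(suspect)
```

If `suspect` is non-empty, the client is above the threshold and a restart of BOINC (or the machine) may be worth trying. Thousands separators vary by locale, which is why the sketch strips everything except digits.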
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1483120 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1483139 - Posted: 1 Mar 2014, 8:39:28 UTC
Last modified: 1 Mar 2014, 8:42:30 UTC

I could be wrong, but it looks like uploads just cratered again... and the SSP is frozen as well.
ID: 1483139 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1483142 - Posted: 1 Mar 2014, 8:40:18 UTC

Ping. Clank. Grind. Ker-splat.

There goes the upload server again, and with it the SSP updates.

Yawn. Back to bed.
ID: 1483142 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1483144 - Posted: 1 Mar 2014, 8:44:48 UTC - in response to Message 1483142.  

It's becoming a bit of an issue.
Hopefully the splitters will keep chugging along so there'll be some work available when we're able to download again.
Grant
Darwin NT
ID: 1483144 · Report as offensive
Profile S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 212
Credit: 43,822,095
RAC: 0
Netherlands
Message 1483155 - Posted: 1 Mar 2014, 9:17:51 UTC

Well, I hope they fix it with a simple reboot... otherwise, with only MBs, lots of folks will be on their backup projects by tonight.
ID: 1483155 · Report as offensive
Profile [B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 1483170 - Posted: 1 Mar 2014, 10:02:19 UTC

I have just seen that the uploads have gone again, so I will wait until they start, I think around 5 this afternoon UK time, with any luck.
ID: 1483170 · Report as offensive
Thomas
Volunteer tester

Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1483176 - Posted: 1 Mar 2014, 10:36:49 UTC - in response to Message 1483175.  

I thought that one of the benefits of moving the servers to the co-location was that there were always people there who could deal with issues like this, such as rebooting hung computers. Obviously that was not one of the services the project pays for.

Agree Sten-Arne - it seems that this is not the case :(
ID: 1483176 · Report as offensive
Juha
Volunteer tester

Joined: 7 Mar 04
Posts: 388
Credit: 1,857,738
RAC: 0
Finland
Message 1483180 - Posted: 1 Mar 2014, 10:55:16 UTC - in response to Message 1483113.  

Have noted a problem with downloading work units. Have had this difficulty with about 40 work units this evening. Have set to NNT for the time being.

A small snippet of the event log. Would these messages indicate a server problem or an issue with my machine?

2/28/2014 23:00:16 | SETI@home | Started download of 22my13aa.31648.1703.438086664207.12.136
2/28/2014 23:00:18 | SETI@home | Giving up on download of 22my13aa.31648.1703.438086664207.12.130: permanent HTTP error

You are running 7.2.39. That version has a bug that occasionally causes problems in communicating with servers.

I haven't seen that bug causing permanent HTTP errors but it doesn't hurt to upgrade.
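
To see how many transfers were abandoned and why, the event log can be filtered with a few lines of Python. This is only a sketch: it assumes every line follows the `date time | project | message` shape of the two lines quoted above, and the function name `failed_downloads` is made up for illustration.

```python
import re
from collections import Counter

# Shape assumed from the two quoted lines: "<date> <time> | <project> | <message>"
LINE_RE = re.compile(r"^(?P<stamp>\S+ \S+) \| (?P<project>[^|]+?) \| (?P<msg>.*)$")
GIVE_UP_RE = re.compile(r"^Giving up on download of (?P<file>\S+): (?P<reason>.+)$")


def failed_downloads(lines):
    """Return (timestamp, project, file, reason) for each abandoned download."""
    failures = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not an event-log line in the assumed shape
        g = GIVE_UP_RE.match(m.group("msg"))
        if g:
            failures.append(
                (m.group("stamp"), m.group("project"), g.group("file"), g.group("reason"))
            )
    return failures


# The two log lines quoted in the post above.
sample = [
    "2/28/2014 23:00:16 | SETI@home | Started download of 22my13aa.31648.1703.438086664207.12.136",
    "2/28/2014 23:00:18 | SETI@home | Giving up on download of 22my13aa.31648.1703.438086664207.12.130: permanent HTTP error",
]
failures = failed_downloads(sample)
reasons = Counter(reason for _, _, _, reason in failures)
print(failures)
print(reasons)
```

Counting by reason makes it easier to tell a burst of one server-side error apart from a mix of different failures on the local machine, though it cannot by itself prove which side is at fault.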
ID: 1483180 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1483202 - Posted: 1 Mar 2014, 11:51:43 UTC
Last modified: 1 Mar 2014, 11:52:05 UTC

UL are down again?
ID: 1483202 · Report as offensive
Profile BladeD
Joined: 9 Aug 11
Posts: 13320
Credit: 1,603,919
RAC: 2
United States
Message 1483204 - Posted: 1 Mar 2014, 11:59:52 UTC - in response to Message 1483202.  

UL are down again?

Yep.
ID: 1483204 · Report as offensive
Profile Michael Hoffmann
Volunteer tester

Joined: 4 Jun 08
Posts: 26
Credit: 3,284,993
RAC: 0
Germany
Message 1483205 - Posted: 1 Mar 2014, 12:04:18 UTC - in response to Message 1483204.  

UL are down again?

Yep.


That ain't cool anymore :(
Om mani padme hum.
ID: 1483205 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1483212 - Posted: 1 Mar 2014, 12:24:34 UTC

At least backup projects will be happy. ;-)

Cheers.
ID: 1483212 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1483216 - Posted: 1 Mar 2014, 12:32:11 UTC - in response to Message 1483212.  

At least backup projects will be happy. ;-)

Not necessarily. Einstein were already struggling under the load.

Note Bernd's comment about the public launch of Android having an effect: although each Android device doesn't need much bandwidth once it's crunching steadily, the initial application downloads for each new connected device add up to a significant volume of data.
ID: 1483216 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1483225 - Posted: 1 Mar 2014, 13:28:15 UTC - in response to Message 1483142.  

Ping. Clank. Grind. Ker-splat.

There goes the upload server again, and with it the SSP updates.

Yawn. Back to bed.

SSP stuck over 5 hours ago. Probably another hour before anyone in CA is awake enough to even remote in, if that's all it needs. Computer is still making contact with the scheduler, though.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1483225 · Report as offensive
Profile Mr. Kevvy Crowdfunding Project Donor*Special Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3776
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 1483229 - Posted: 1 Mar 2014, 13:38:07 UTC - in response to Message 1483225.  
Last modified: 1 Mar 2014, 13:40:40 UTC

Probably another hour before anyone in CA is awake enough to even remote in, if that's all it needs.


CoLo facility is staffed 24x7 to prevent these outages... guess it just hasn't been noticed (or someone's nodded off... lol.) I have seen outages resolve in the wee hours before so it usually works.
Edit: OK... sometimes. :^)
ID: 1483229 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1483230 - Posted: 1 Mar 2014, 13:40:52 UTC
Last modified: 1 Mar 2014, 13:43:26 UTC

Something else must be in place; there are too many unscheduled outages in so short a time. They need to fix the problem, not just restart the server and wait for the next problem to appear.

Probably what we actually need is a way to communicate directly with the colo staff to call their attention, so they could take some action to restart the server. Probably the servers are in a set-and-forget configuration, with nobody checking them.
ID: 1483230 · Report as offensive
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1483236 - Posted: 1 Mar 2014, 13:54:57 UTC

FYI, I sent an e-mail to David, Eric, Matt & Jeff because the SSP is frozen and uploads are not possible.
ID: 1483236 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.