Panic Mode On (67) Server problems?

kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1196251 - Posted: 17 Feb 2012, 7:07:50 UTC - in response to Message 1196250.  

Meow!!! Do I ever have happy kitties!
Was a bit afraid of what I'd find when I got home from work.
Very pleasantly surprised that even with the extended outage, I was able to build some cache over the day and the kibble bowls are in much better state of fill than I expected. Even with the bandwidth maxxed all day.

Well done, Seti servers and the boyz in da lab!



Looking pretty good this side of the pond.

Almost got a full cache, there is a lot of shorties in it but work is work.



Actually, the kitties luv shorties....that is, when the servers can keep up with the demand of my fast rigs doing 2 minute WUs.

RAC is suffering, as buttloads are going into pending....gonna take a while for the wingys to catch up. Not unexpected, coming back up from a standing start.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1196251
Kevin Olley
Joined: 3 Aug 99
Posts: 906
Credit: 261,085,289
RAC: 572
United Kingdom
Message 1196254 - Posted: 17 Feb 2012, 7:15:01 UTC - in response to Message 1196251.  

Actually, the kitties luv shorties....that is, when the servers can keep up with the demand of my fast rigs doing 2 minute WUs.

RAC is suffering, as buttloads are going into pending....gonna take a while for the wingys to catch up. Not unexpected, coming back up from a standing start.


Yep, nothing wrong with shorties, I would do them all day long if the pipe could keep up:-)

Pendings are climbing at a fast rate.


Kevin


ID: 1196254
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1196257 - Posted: 17 Feb 2012, 7:24:35 UTC

I do find this current level of wonderfulness perplexing.....not to complain, of course.

But here we find ourselves recovering from a long outage, AP is splitting and sending, the bandwidth is maxxed, and yet I am still able to build MB cache.

What is making the momentary situation so much better than usual confounds me quite frankly. In a very good sort of way.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1196257
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65750
Credit: 55,293,173
RAC: 49
United States
Message 1196259 - Posted: 17 Feb 2012, 7:25:38 UTC - in response to Message 1196251.  

Meow!!! Do I ever have happy kitties!
Was a bit afraid of what I'd find when I got home from work.
Very pleasantly surprised that even with the extended outage, I was able to build some cache over the day and the kibble bowls are in much better state of fill than I expected. Even with the bandwidth maxxed all day.

Well done, Seti servers and the boyz in da lab!



Looking pretty good this side of the pond.

Almost got a full cache, there is a lot of shorties in it but work is work.



Actually, the kitties luv shorties....that is, when the servers can keep up with the demand of my fast rigs doing 2 minute WUs.

RAC is suffering, as buttloads are going into pending....gonna take a while for the wingys to catch up. Not unexpected, coming back up from a standing start.

Yeah, we took a bit of a drop, and then I added a 5th GPU today too.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1196259
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1196293 - Posted: 17 Feb 2012, 9:26:46 UTC - in response to Message 1195263.  

Hi Hal,
Seems work is flowing again now, but I'm getting a load of 'lost WU' reported..

17/02/2012 08:52:12 | SETI@home | Started upload of 21my11ab.4602.6611.4.10.133_0_0
17/02/2012 08:52:14 | SETI@home | Scheduler request completed: got 20 new tasks

Had loads of 20/27/40 WU sent to my rig..

But also the following..

17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.21603.9238.10.10.132_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.21603.9238.10.10.133_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 23my11ad.23287.6611.7.10.38_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.107_0
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.21603.9238.10.10.136_0
17/02/2012 08:52:14 | SETI@home | Resent lost task 23my11ad.23287.6611.7.10.37_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.110_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 23my11ad.23287.6611.7.10.41_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.118_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.123_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.120_1
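
To see how many resends arrive with each scheduler contact, log lines like the ones above can be tallied with a few lines of Python. This is only a minimal sketch: it assumes the `timestamp | project | message` layout seen in the pasted log, and the function name `count_resent` is made up for illustration.

```python
from collections import Counter

def count_resent(log_text):
    """Count 'Resent lost task' messages per timestamp in BOINC event log text."""
    counts = Counter()
    for line in log_text.splitlines():
        # Split into at most three fields: timestamp, project, message.
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue  # not a 'timestamp | project | message' line
        timestamp, _project, message = parts
        if message.startswith("Resent lost task"):
            counts[timestamp] += 1
    return counts

sample = """\
17/02/2012 08:52:14 | SETI@home | Scheduler request completed: got 20 new tasks
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.21603.9238.10.10.132_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.21603.9238.10.10.133_1
"""
print(count_resent(sample))
```

Grouping by timestamp makes it easy to spot whether the same batch of tasks keeps coming back on successive requests.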

Any ideas what's up here?

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1196293
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1196295 - Posted: 17 Feb 2012, 9:31:38 UTC - in response to Message 1196293.  

Hi Hal,
Seems work is flowing again now, but I'm getting a load of 'lost WU' reported..

17/02/2012 08:52:12 | SETI@home | Started upload of 21my11ab.4602.6611.4.10.133_0_0
17/02/2012 08:52:14 | SETI@home | Scheduler request completed: got 20 new tasks

Had loads of 20/27/40 WU sent to my rig..

But also the following..

17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.21603.9238.10.10.132_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.21603.9238.10.10.133_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 23my11ad.23287.6611.7.10.38_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.107_0
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.21603.9238.10.10.136_0
17/02/2012 08:52:14 | SETI@home | Resent lost task 23my11ad.23287.6611.7.10.37_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.110_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 23my11ad.23287.6611.7.10.41_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.118_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.123_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.120_1

Any ideas whats up here?

Regards,

No problem on your part.......
The servers and the bandwidth are both stuffed with activity. If a work request fails due to comms problems, the work has been issued to you by the servers, but your computer never got the message. So, on the next good try the servers know your computers have not gotten the WUs, and so, the resends.
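
The resend behaviour described above can be illustrated with a toy model. This is only a sketch of the idea, not BOINC's actual scheduler code: the class `ToyScheduler`, its method, and the task names are all hypothetical. The server remembers what it has assigned to a host, the client reports what it really holds, and anything assigned but missing gets sent again.

```python
class ToyScheduler:
    """Hypothetical server-side bookkeeping of tasks assigned per host."""

    def __init__(self):
        self.assigned = {}  # host id -> set of task names the server believes were sent

    def handle_request(self, host, tasks_on_host, new_tasks):
        """Return (resent, fresh) for one scheduler request.

        tasks_on_host: task names the client says it currently holds.
        new_tasks: fresh task names the server wants to hand out now.
        """
        known = self.assigned.setdefault(host, set())
        # Assigned earlier but absent on the client: the earlier reply was lost.
        resent = sorted(known - set(tasks_on_host))
        known.update(new_tasks)
        return resent, list(new_tasks)


server = ToyScheduler()

# First request: the server assigns two tasks, but the reply is lost in transit,
# so the client never learns about them.
server.handle_request("rig1", tasks_on_host=[], new_tasks=["wu_132_1", "wu_133_1"])

# Next successful request: the client still reports an empty task list.
resent, fresh = server.handle_request("rig1", tasks_on_host=[], new_tasks=["wu_200_0"])
print("Resent lost tasks:", resent)
print("New tasks:", fresh)
```

The model leaves out everything else a real scheduler tracks (completed results, deadlines, quotas); it only shows why a failed reply followed by a good one produces "Resent lost task" lines.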


"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1196295
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1196304 - Posted: 17 Feb 2012, 10:24:02 UTC - in response to Message 1196295.  

Hi Hal,
Thanks for the info... Comms problems abound, and it's not, IMHO, all due to everyone and his/her cat trying to obtain WU. I did a test from the UK to the USA [San Francisco] the other day.. result: stuttering, delays, lost packets etc..

Methinks somewhere between London and Berkeley the pipe is duff:-(

How it is for users in the USA I have no idea, but I suspect there will be problems between folks on the west coast and the east of the USA..

Regards,

Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1196304
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1196305 - Posted: 17 Feb 2012, 10:37:20 UTC - in response to Message 1196304.  

Hi Hal,
Thanks for the info... Comms problems abound and its not IMHO all due to everyone and his/her cat trying to obtain WU, did a test from UK to USA[San Franciso] the other day.. result stuttering and delays lost packets etc..

Methinks somewhere between London and Berkeley the pipe is duff:-(

How it is for users in the USA I have no idea, but I suspect there will be problems between folks on the west coast and the east of the USA..

Regards,

ISP and the routes that are taken to get from point A to point B make all the difference. Some people here in the US, and even on the west coast have issues, whilst others have no problems at all. I have yet to have any of the dropped-packet/routing issues that have plagued the [mostly] international users. The only dropped packets I get are when the pipe is saturated, but everybody suffers packet drop at that point.

Presently, the only issue I have is that the .18 download server has a lot of HTTP errors and time-outs. IF you can get a transfer going, it goes less than 2KB/sec. .13 is the good one.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1196305
Lionel
Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1196307 - Posted: 17 Feb 2012, 10:52:13 UTC - in response to Message 1196305.  


so many shorties so little time ...
ID: 1196307
LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1196309 - Posted: 17 Feb 2012, 10:59:22 UTC - in response to Message 1196210.  


Edit:
and i m still manually reporting them from time to time :P
2012-02-16 23:01:10 | SETI@home | update requested by user
2012-02-16 23:01:14 | SETI@home | Sending scheduler request: Requested by user.
2012-02-16 23:01:14 | SETI@home | Reporting 5 completed tasks, not requesting new tasks
2012-02-16 23:01:18 | SETI@home | Scheduler request completed


with 'not requesting new tasks' the question to ask is 'why?'.
There can be several reasons; it's not always the stuck downloads from the excessive project backoffs on 6.12.34 (and we [i.e. the BOINC alpha testing community that crunches SETI] have tried several times to convince DA to change that stupid algorithm again; he's being very stubborn about reverting changes he made...)
Next time, better try hunting it down.
ID: 1196309
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1196329 - Posted: 17 Feb 2012, 12:53:09 UTC - in response to Message 1196305.  
Last modified: 17 Feb 2012, 12:55:06 UTC

Hi,

quote/ Presently, the only issue I have is that the .18 download server has a lot of HTTP errors and time-outs. IF you can get a transfer going, it goes less than 2KB/sec. .13 is the good one./unquote

Humm, I'm getting 74KB/s uploads, and anything from 0.42 to 34KB/s d/loads, with endless retries and project backoffs.
On one 50 WU d/l I had more WU with between 0.01% and 98.nn% in the pipeline than ones that had yet to be attempted..
Having to kick the d/l's backside is a wee bit of a pain in the rear end:-)

I notice a load of HTTP errors reported that are, I think, causing the endless 'retry in nn min nn sec' delays, and those 'nn mins' vary from 5 mins to 5 hours... A 5 hour retry delay is plain daft. With the quantity of WU being sent, it's quite possible that it's well over 24 hrs before a given WU is actually received, let alone all the rest, and on some batches there are 75% or more in retry mode..
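
Retry delays that range from minutes to hours are what a randomized exponential backoff tends to produce. The sketch below is a generic illustration only; it is not the algorithm BOINC 6.12.34 uses, and the 5 minute floor and 5 hour cap are simply the extremes reported above, taken as assumed parameters.

```python
import random

MIN_DELAY = 5 * 60        # assumed floor: 5 minutes, in seconds
MAX_DELAY = 5 * 60 * 60   # assumed cap: 5 hours, in seconds

def retry_delay(failures, rng=random):
    """Delay in seconds before the next retry after `failures` consecutive failures (>= 1)."""
    # Double the ceiling with every failure, but never exceed the cap.
    ceiling = min(MAX_DELAY, MIN_DELAY * 2 ** (failures - 1))
    # Pick a random point between the floor and the ceiling to spread retries out.
    return rng.uniform(MIN_DELAY, ceiling)

for n in range(1, 9):
    print(n, "failures ->", round(retry_delay(n) / 60), "min")
```

With these assumed numbers the ceiling reaches the 5 hour cap after seven consecutive failures, which is how a transfer that keeps hitting HTTP errors can end up parked for hours.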

Damm strange all the uploads are damm fast..

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1196329
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1196333 - Posted: 17 Feb 2012, 13:46:51 UTC - in response to Message 1196305.  

Presently, the only issue I have is that the .18 download server has a lot of HTTP errors and time-outs. IF you can get a transfer going, it goes less than 2KB/sec. .13 is the good one.

"208.68.240.13 boinc2.ssl.berkeley.edu" in the hosts file helped indeed a lot.
ID: 1196333
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65750
Credit: 55,293,173
RAC: 49
United States
Message 1196340 - Posted: 17 Feb 2012, 14:41:01 UTC - in response to Message 1196293.  

Hi Hal,
Seems work is flowing again now, but I'm getting a load of 'lost WU' reported..

17/02/2012 08:52:12 | SETI@home | Started upload of 21my11ab.4602.6611.4.10.133_0_0
17/02/2012 08:52:14 | SETI@home | Scheduler request completed: got 20 new tasks

Had loads of 20/27/40 WU sent to my rig..

But also the following..

17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.21603.9238.10.10.132_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.21603.9238.10.10.133_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 23my11ad.23287.6611.7.10.38_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.107_0
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.21603.9238.10.10.136_0
17/02/2012 08:52:14 | SETI@home | Resent lost task 23my11ad.23287.6611.7.10.37_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.110_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 23my11ad.23287.6611.7.10.41_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.118_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.123_1
17/02/2012 08:52:14 | SETI@home | Resent lost task 21jn11af.26568.2934.12.10.120_1

Any ideas whats up here?

Regards,

Yeah, ya just got a few GHOSTS to deal with. Me, I have that, and two of my cards evidently need cleaning out... Dust Bunnies
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1196340
Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1196351 - Posted: 17 Feb 2012, 15:22:42 UTC - in response to Message 1196333.  

"208.68.240.13 boinc2.ssl.berkeley.edu" in the hosts file helped indeed a lot.

Pardon my ignorance, but where the heck is the "hosts" file located? I can't find it in any BOINC directory.

T.A. :-S

ID: 1196351
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1196354 - Posted: 17 Feb 2012, 15:34:58 UTC - in response to Message 1196351.  

"208.68.240.13 boinc2.ssl.berkeley.edu" in the hosts file helped indeed a lot.

Pardon my ignorance, but WTH is the "hosts" file located ? I can't find it in any BOINC directory.

T.A. :-S

It's a standard operating system file. For Windows (all versions, up to and including 64-bit Windows 7), it's in

C:\Windows\System32\drivers\etc\

For Win7 and (I suspect) Vista, you'll need to run your text editor - notepad will do - in Administrator mode to overcome system file protection.
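
For anyone scripting this, the location can be worked out per operating system. A minimal sketch, assuming the standard Windows location given above and `/etc/hosts` on Linux and macOS; the helper name `hosts_path` is invented here.

```python
import os
import platform

def hosts_path(system=None, environ=None):
    """Return the usual hosts file location for the given OS name."""
    system = system or platform.system()
    environ = os.environ if environ is None else environ
    if system == "Windows":
        # SystemRoot normally points at C:\Windows; fall back to that if unset.
        root = environ.get("SystemRoot", r"C:\Windows")
        return root + r"\System32\drivers\etc\hosts"
    return "/etc/hosts"  # Linux, macOS and other Unix-like systems

print(hosts_path())
```

Reading the file needs no special rights; writing to it needs an elevated (Administrator or root) process, as noted above.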
ID: 1196354
Lint trap
Joined: 30 May 03
Posts: 871
Credit: 28,092,319
RAC: 0
United States
Message 1196357 - Posted: 17 Feb 2012, 15:43:39 UTC - in response to Message 1196351.  
Last modified: 17 Feb 2012, 16:10:42 UTC

In XP x86 it is in the Windows\system32\drivers\etc directory.

My hosts looks like this:

# Copyright (c) 1993-1999 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

127.0.0.1       localhost

208.68.240.13	boinc2.ssl.berkeley.edu		# S@H Download Server
208.68.240.18	boinc2.ssl.berkeley.edu		# S@H Download Server
208.68.240.16	setiboincdata.ssl.berkeley.edu	# S@H Upload Server
208.68.240.20	setiboinc.ssl.berkeley.edu	# S@H Scheduling Server
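
Note that this listing maps `boinc2.ssl.berkeley.edu` to two addresses. How a resolver treats a name that appears twice in a hosts file may differ between systems, so it can be worth checking a file for such entries. A minimal Python sketch (the helper names `parse_hosts` and `multi_mapped` are made up for illustration):

```python
from collections import defaultdict

def parse_hosts(text):
    """Map each hostname to the list of IP addresses given for it, in file order."""
    mapping = defaultdict(list)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping[name].append(ip)
    return dict(mapping)

def multi_mapped(mapping):
    """Hostnames that appear with more than one address."""
    return {name: ips for name, ips in mapping.items() if len(ips) > 1}

hosts_text = """\
127.0.0.1       localhost
208.68.240.13   boinc2.ssl.berkeley.edu         # S@H Download Server
208.68.240.18   boinc2.ssl.berkeley.edu         # S@H Download Server
208.68.240.16   setiboincdata.ssl.berkeley.edu  # S@H Upload Server
208.68.240.20   setiboinc.ssl.berkeley.edu      # S@H Scheduling Server
"""
print(multi_mapped(parse_hosts(hosts_text)))
```

If the aim is to steer clear of the misbehaving .18 server, keeping only the .13 line for that hostname is the simpler setup.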


You also want to be sure to have the 'fixed' libcurl.dll in BOINC's main directory.

Version 7.19.7.0 and dated 12/3/09, or newer. IIRC, the broken libcurl.dll file does not properly switch between host locations when one of the download server ip addresses doesn't work.
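
The failover expected from the fixed library can be pictured as trying each known address in turn. The sketch below only illustrates that idea in Python; it is not libcurl's code, `first_working` and the injected `connect` callable are hypothetical, and port 80 is an assumption based on the downloads being plain HTTP.

```python
import socket

def tcp_connect(ip, port=80, timeout=5):
    """Open and immediately close a TCP connection; raises OSError on failure."""
    with socket.create_connection((ip, port), timeout=timeout):
        pass

def first_working(addresses, connect=tcp_connect):
    """Return the first address that accepts a connection, or None if all fail."""
    for ip in addresses:
        try:
            connect(ip)
            return ip
        except OSError:
            continue  # this address is down or timing out; try the next one
    return None
```

Calling `first_working(["208.68.240.18", "208.68.240.13"])` would try .18 first and fall through to .13 if the connection fails or times out.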

Lt
edited...
ID: 1196357
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1196358 - Posted: 17 Feb 2012, 15:53:38 UTC - in response to Message 1196345.  

Presently, the only issue I have is that the .18 download server has a lot of HTTP errors and time-outs. IF you can get a transfer going, it goes less than 2KB/sec. .13 is the good one.

"208.68.240.13 boinc2.ssl.berkeley.edu" in the hosts file helped indeed a lot.


do we need to reboot after that ? or reboot the BOINC ?

No. You can, however, ping boinc2.ssl.berkeley.edu to check whether it's working before you click on retry in BOINC.
ID: 1196358
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1196377 - Posted: 17 Feb 2012, 16:55:15 UTC - in response to Message 1196340.  

Hi VWB,
Those 'lost' WU are becoming a problem. I've had about 7 resends of the same WUs despite the fact I already have them on my rig. They are chewing up my ISP's data d/l quota with the endless resends of the exact same WU.
The amount of space used on my HDD is static, since these WUs simply overwrite the ones already there.

Don't need my ISP quoting 'fair use' at me as well.

Someone at S@H HQ needs to sort out the handshaking problem BOINC and the servers seem to have.

Dustbunnies:-) Had them shorting out the switch on my main rig PSU; had to shell out £80 odd for a new one.

Wasn't a total loss tho.. Cleaned it out and used it to upgrade another rig with a below 400 watt PSU..

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1196377
Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1196384 - Posted: 17 Feb 2012, 17:10:29 UTC - in response to Message 1196354.  

.....For Windows (all versions, up to and including 64-bit Windows 7), it's in

C:\Windows\System32\drivers\etc\

Thank you, Richard and Lint trap.

I've added the addresses to the file. I'm still getting backoffs, but it seems to have made a difference. Download speed has definitely improved.

T.A.
ID: 1196384


 