Panic Mode On (9) Server problems

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 808652 - Posted: 15 Sep 2008, 22:28:43 UTC

New thread to continue server problems.

ID: 808652
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 808656 - Posted: 15 Sep 2008, 22:34:20 UTC - in response to Message 808652.  

New thread to continue server problems.

Now why would we need that?

There aren't any continuing server problems to panic about, are there? ;-)
ID: 808656
PhonAcq
Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 808694 - Posted: 15 Sep 2008, 23:57:46 UTC

No, they just come and go.
ID: 808694
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 808795 - Posted: 16 Sep 2008, 6:27:36 UTC

Well I am getting no new work, since I switched on an hour ago
ID: 808795
Dr. C.E.T.I.
Joined: 29 Feb 00
Posts: 16019
Credit: 794,685
RAC: 0
United States
Message 808796 - Posted: 16 Sep 2008, 6:30:13 UTC - in response to Message 808795.  


Well I am getting no new work, since I switched on an hour ago



Item                                          Value          As of
Results ready to send                         1              0m
Current result creation rate                  22.90/sec      0m
Results out in the field                      3,265,650      0m
Results received in last hour                 45,294         0m
Result turnaround time (last hour average)    66.87 hours    0m
Results returned and awaiting validation      2,810,170      0m
Workunits waiting for validation              1              0m
Workunits waiting for assimilation            129,612        0m
Workunit files waiting for deletion           21             0m
Result files waiting for deletion             63             0m
Workunits waiting for db purging              539,953        0m
Results waiting for db purging                1,126,633      0m
Transitioner backlog (hours)                  0              0m

;) go get iT ;)


BOINC Wiki . . .

Science Status Page . . .
ID: 808796
6dj72cn8
Joined: 3 Sep 99
Posts: 24
Credit: 163,811
RAC: 0
Australia
Message 808813 - Posted: 16 Sep 2008, 9:24:03 UTC - in response to Message 808652.  

New thread to continue server problems.

It would save bandwidth if we had a thread for when there aren't server problems!

ID: 808813
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 808851 - Posted: 16 Sep 2008, 15:56:33 UTC - in response to Message 808813.  

New thread to continue server problems.

It would save bandwidth if we had a thread for when there aren't server problems!


It would save bandwidth if people didn't panic.

The BOINC client is designed to keep crunching and talk to the project when the project servers are available.
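
As a rough illustration of that retry behaviour, here is a minimal sketch of randomized exponential backoff in Python. It is not BOINC's actual code: the function name `backoff_delay`, the 1-minute floor and the 4-hour ceiling are all assumptions, picked only because they resemble the "Backing off" intervals that turn up in client message logs.

```python
import random

MIN_DELAY_S = 60          # assumed floor: 1 minute
MAX_DELAY_S = 4 * 3600    # assumed ceiling: 4 hours


def backoff_delay(failures: int, rng: random.Random) -> float:
    """Seconds to wait after `failures` consecutive failed attempts (>= 1)."""
    if failures < 1:
        raise ValueError("failures must be at least 1")
    # The upper bound doubles with each failure until it hits the ceiling.
    upper = min(MAX_DELAY_S, MIN_DELAY_S * 2 ** (failures - 1))
    # Random jitter keeps many clients from retrying at the same instant.
    return rng.uniform(MIN_DELAY_S, upper)


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed so repeated runs print the same delays
    for n in range(1, 11):
        print(f"failure {n:2d}: retry in {backoff_delay(n, rng):7.0f} s")
```

The point of the jitter is that thousands of clients that all failed at the same moment do not all come back at the same moment, which is what lets the servers recover without anyone having to intervene.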
ID: 808851
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 808903 - Posted: 16 Sep 2008, 22:48:02 UTC - in response to Message 808851.  

It would save bandwidth if people didn't panic.

The BOINC client is designed to keep crunching and talk to the project when the project servers are available.


WORD!!

ID: 808903
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 809505 - Posted: 18 Sep 2008, 15:37:44 UTC

Last Sunday I posted in the (8) version of this thread, in part:
There's a situation on the Server status page which I've been wondering about. Before and after the difficulties last Thursday (Sept. 11) the status showed the first two channels on 22mr08aa being active, and the first one on 23mr08aa. That hasn't changed, indicating those three splitter processes aren't producing any WUs.

The Tuesday outage modified that; there's only one 'active' on each now. But there still hasn't been any progress past the first channel on those. It's been at least a full week now...
                                                                 Joe
ID: 809505
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51540
Credit: 1,018,363,574
RAC: 1,004
United States
Message 809740 - Posted: 19 Sep 2008, 5:09:10 UTC - in response to Message 809738.  

Seems to be a problem getting in touch with the scheduler as I'm getting HTTP Errors when Boinc tries to contact the server.

9/18/2008 10:03:09 PM||Project communication failed: attempting access to reference site
9/18/2008 10:03:11 PM||Access to reference site succeeded - project servers may be temporarily down.
9/18/2008 10:03:13 PM|SETI@home|Scheduler request failed: failed sending data to the peer
9/18/2008 10:03:13 PM|SETI@home|Deferring communication for 1 min 0 sec
9/18/2008 10:03:13 PM|SETI@home|Reason: scheduler request failed
9/18/2008 10:03:28 PM|SETI@home|Sending scheduler request: Requested by user
9/18/2008 10:03:28 PM|SETI@home|Reporting 2 tasks
9/18/2008 10:04:54 PM|SETI@home|Computation for task 24au08ag.484.13569.5.8.196_0 finished
9/18/2008 10:04:54 PM|SETI@home|Starting 19au08ad.28082.4162.3.8.106_1
9/18/2008 10:04:54 PM|SETI@home|Starting task 19au08ad.28082.4162.3.8.106_1 using setiathome_enhanced version 528
9/18/2008 10:04:56 PM|SETI@home|[file_xfer] Started upload of file 24au08ag.484.13569.5.8.196_0_0
9/18/2008 10:05:17 PM||Project communication failed: attempting access to reference site
9/18/2008 10:05:17 PM|SETI@home|[file_xfer] Temporarily failed upload of 24au08ag.484.13569.5.8.196_0_0: HTTP error
9/18/2008 10:05:17 PM|SETI@home|Backing off 1 min 0 sec on upload of file 24au08ag.484.13569.5.8.196_0_0
9/18/2008 10:05:18 PM||Access to reference site succeeded - project servers may be temporarily down.
9/18/2008 10:06:18 PM|SETI@home|[file_xfer] Started upload of file 24au08ag.484.13569.5.8.196_0_0

I just noticed that in the last 20 minutes or so as well......none of my rigs are getting through..........
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 809740
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51540
Credit: 1,018,363,574
RAC: 1,004
United States
Message 809743 - Posted: 19 Sep 2008, 5:21:48 UTC

Ahhh....the problem seems to have been short.......my rigs are connecting again....
Another blip in the history of Seti....LOL.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 809743
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 809767 - Posted: 19 Sep 2008, 6:29:39 UTC

I normally get the odd HTTP error on my downloads; I have had them since the main problem a few weeks ago. The WU normally then downloads a minute or two later, so I'm not worried.
ID: 809767
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51540
Credit: 1,018,363,574
RAC: 1,004
United States
Message 809770 - Posted: 19 Sep 2008, 6:37:50 UTC - in response to Message 809767.  
Last modified: 19 Sep 2008, 6:39:15 UTC

I normally get the odd HTTP error on my downloads; I have had them since the main problem a few weeks ago. The WU normally then downloads a minute or two later, so I'm not worried.

Well, the bandwidth seems to be about maxxxed.....

Why the continuous activity at 90 Mbps or so???

Are they downloading data on the link again?
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 809770
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51540
Credit: 1,018,363,574
RAC: 1,004
United States
Message 809777 - Posted: 19 Sep 2008, 6:49:05 UTC - in response to Message 809774.  

I normally get the odd HTTP error on my downloads; I have had them since the main problem a few weeks ago. The WU normally then downloads a minute or two later, so I'm not worried.

Well, the bandwidth seems to be about maxxxed.....

Why the continuous activity at 90 Mbps or so???

Are they downloading data on the link again?

Could be. That, or we've got a spambot here. ;)



I know the possibility of DNS attacks was floated in the past, but was discounted at the time... might a new analysis of network traffic be in order at this time?
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 809777
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 809779 - Posted: 19 Sep 2008, 6:53:56 UTC - in response to Message 809777.  
Last modified: 19 Sep 2008, 6:54:54 UTC

I know the possibility of DNS attacks was floated in the past, but was discounted at the time... might a new analysis of network traffic be in order at this time?
I was curious about the traffic too, as I've been studying some networking subjects lately. When I started, traffic was at a much more humble rate, around 20 Mbps on the Cricket graphs. Switching to the long-term view recently, however, seems to tell the grim truth: bandwidth utilisation is scaling proportionally with Moore's law...
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 809779
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51540
Credit: 1,018,363,574
RAC: 1,004
United States
Message 809783 - Posted: 19 Sep 2008, 7:03:56 UTC - in response to Message 809779.  
Last modified: 19 Sep 2008, 7:06:07 UTC

I know the possibility of DNS attacks was floated in the past, but was discounted at the time... might a new analysis of network traffic be in order at this time?
I was curious about the traffic too, as I've been studying some networking subjects lately. When I started, traffic was at a much more humble rate, around 20 Mbps on the Cricket graphs. Switching to the long-term view recently, however, seems to tell the grim truth: bandwidth utilisation is scaling proportionally with Moore's law...

But is it all due to crunching, or server data transfer, or what??????
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 809783
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 809785 - Posted: 19 Sep 2008, 7:40:35 UTC - in response to Message 809779.  
Last modified: 19 Sep 2008, 8:01:17 UTC

I know the possibility of DNS attacks was floated in the past, but was discounted at the time... might a new analysis of network traffic be in order at this time?
I was curious about the traffic too, as I've been studying some networking subjects lately. When I started, traffic was at a much more humble rate, around 20 Mbps on the Cricket graphs. Switching to the long-term view recently, however, seems to tell the grim truth: bandwidth utilisation is scaling proportionally with Moore's law...


Hi Mark and Jason, not too much trouble getting WUs uploaded or downloaded here; it only goes in big 'chunks', like 20 to 40 WUs at a time.
On all hosts, 'large' numbers are waiting to upload. But never mind that.
Eventually, they get uploaded.
{EDIT/ADD} We don't have to panic, though ;)
ID: 809785
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 809797 - Posted: 19 Sep 2008, 8:29:40 UTC - in response to Message 809783.  
Last modified: 19 Sep 2008, 8:31:05 UTC

But is it all due to crunching, or server data transfer, or what??????
Well, surely some peak periods would represent recovery and/or bulk moving of server data about. However, the general trend (and I'm looking at the yearly Cricket graph here) looks somewhere between linear and exponential growth, and has gone from ~20 Mbps to 60+ Mbps [in about a year?].

Assuming from the hardware donations threads that upgrading internal network speeds is going to happen gradually, that still leaves that 100 Mbps link 'up the hill'. If that connection is fibre of some sort, replacing the transceivers at either end would, I reckon, cost big bucks :(. I hope we'll be alright for a while longer before that capacity is required, but then look at the general hardware performance increases of the last few years: they seem pretty staggering, and set to continue.
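
To put rough numbers on "a while longer", here is a back-of-envelope sketch in Python that extrapolates the two extremes mentioned above, linear and exponential growth, from ~20 Mbps to ~60 Mbps towards the 100 Mbps link. The one-year interval is the uncertain "[in about a year?]" figure, and the function names are made up for this illustration, so treat the output as a guess at the bounds, not a forecast.

```python
import math


def years_until_full_linear(start, now, capacity, elapsed_years=1.0):
    """Years until `capacity` is reached if traffic keeps rising by a fixed amount per year."""
    slope = (now - start) / elapsed_years          # Mbps gained per year
    return (capacity - now) / slope


def years_until_full_exponential(start, now, capacity, elapsed_years=1.0):
    """Years until `capacity` is reached if traffic keeps multiplying by a fixed factor per year."""
    factor = (now / start) ** (1.0 / elapsed_years)  # growth factor per year
    return math.log(capacity / now) / math.log(factor)


if __name__ == "__main__":
    # ~20 Mbps a year ago, ~60 Mbps now, 100 Mbps link.
    print(f"linear:      {years_until_full_linear(20, 60, 100):.2f} years")       # 1.00
    print(f"exponential: {years_until_full_exponential(20, 60, 100):.2f} years")  # 0.46
```

So if those figures are anywhere near right, the link would be saturated somewhere between a bit under six months and a year from now, depending on which end of "between linear and exponential" the trend really sits.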
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 809797
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 809938 - Posted: 19 Sep 2008, 18:55:45 UTC - in response to Message 809797.  

But is it all due to crunching, or server data transfer, or what??????
Well, surely some peak periods would represent recovery and/or bulk moving of server data about. However, the general trend (and I'm looking at the yearly Cricket graph here) looks somewhere between linear and exponential growth, and has gone from ~20 Mbps to 60+ Mbps [in about a year?].

Assuming from the hardware donations threads that upgrading internal network speeds is going to happen gradually, that still leaves that 100 Mbps link 'up the hill'. If that connection is fibre of some sort, replacing the transceivers at either end would, I reckon, cost big bucks :(. I hope we'll be alright for a while longer before that capacity is required, but then look at the general hardware performance increases of the last few years: they seem pretty staggering, and set to continue.

One of Eric's posts estimated the upgrade cost at over 30K USD, definitely not pocket change.

As to the blip, I've noticed it at recurring periods on the Cricket graph. That is, the traffic in to SSL jumps up to about 12 MBits/sec at intervals of about 3 hours and remains high for about half an hour. If I try to upload or get new work during those high periods, there are a lot of HTTP 500 errors (Internal server error) and occasionally 503 or even 403. When the rate is below 10 MBits/sec those errors are fairly rare.

I think the 90+ MBits/sec output rates recently are mostly the project uploading raw data to the NERSC HPSS at LBNL. There were a couple of periods when the project was down but there was still a steady ~35 MBits/sec flowing out. That would be the max rate of that flow, though Matt said it was NICE'd, so it probably doesn't get that much when WUs are also being downloaded. Still, those blips at ~3 hour intervals are suspiciously close to how long it takes to upload 50 GB at 35 MBits/sec.
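
For anyone who wants to check that last figure, the arithmetic is short enough to sketch in Python. The helper name `transfer_hours` is made up here, and it assumes decimal units (1 GB = 10^9 bytes, 1 Mbit = 10^6 bits) and a link that sustains the full rate for the whole transfer.

```python
def transfer_hours(size_gb: float, rate_mbit_per_s: float) -> float:
    """Hours needed to move `size_gb` gigabytes at a steady `rate_mbit_per_s`."""
    bits = size_gb * 1e9 * 8                     # gigabytes -> bits
    seconds = bits / (rate_mbit_per_s * 1e6)     # bits / (bits per second)
    return seconds / 3600


if __name__ == "__main__":
    print(f"{transfer_hours(50, 35):.2f} hours")  # 3.17 hours
```

That is 400,000 Mbit at 35 Mbit/sec, or about 11,400 seconds, which lines up with the ~3 hour spacing of the blips.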
                                                                Joe
ID: 809938
Careface
Joined: 6 Jun 03
Posts: 128
Credit: 16,561,684
RAC: 0
New Zealand
Message 810211 - Posted: 20 Sep 2008, 8:02:02 UTC
Last modified: 20 Sep 2008, 8:03:59 UTC

I'm getting rather worried now. My "best" cruncher (laptop) is almost out of work and for some reason isn't requesting any more, and whenever I try to upload the last 3 or so days of work, the uploads get to 100% and then fail. Is this a server problem, do you think?

19/09/2008 8:00:41 p.m.|SETI@home|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 0 completed tasks
19/09/2008 8:01:00 p.m.||Project communication failed: attempting access to reference site
19/09/2008 8:01:00 p.m.|SETI@home|Temporarily failed upload of 02ap08ad.19170.23385.14.8.119_2_0: http error
19/09/2008 8:01:00 p.m.|SETI@home|Backing off 3 hr 50 min 48 sec on upload of 02ap08ad.19170.23385.14.8.119_2_0
19/09/2008 8:01:00 p.m.|SETI@home|Temporarily failed upload of 18au08ab.20784.12752.9.8.246_1_0: http error
19/09/2008 8:01:00 p.m.|SETI@home|Backing off 3 hr 58 min 31 sec on upload of 18au08ab.20784.12752.9.8.246_1_0
19/09/2008 8:01:00 p.m.|SETI@home|Started upload of 04ap08ac.18298.2936.12.8.211_2_0
19/09/2008 8:01:00 p.m.|SETI@home|Started upload of 18au08ab.10585.16842.10.8.44_1_0
19/09/2008 8:01:02 p.m.||Access to reference site succeeded - project servers may be temporarily down.
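
When comparing logs like this between machines, it can help to pull the backoff intervals out mechanically. The Python sketch below assumes only what is visible in the messages posted in this thread: three fields separated by `|` (timestamp, project, message) and backoff text of the form "Backing off 3 hr 50 min 48 sec". The names `LogLine`, `parse_line` and `backoff_seconds` are made up for this example, and other client versions may format things differently.

```python
import re
from typing import NamedTuple, Optional


class LogLine(NamedTuple):
    timestamp: str   # kept as text, since the date format depends on locale
    project: str     # empty for client-wide messages
    message: str


# Hours and minutes are optional, as in "Backing off 1 min 0 sec".
BACKOFF_RE = re.compile(r"Backing off (?:(\d+) hr )?(?:(\d+) min )?(\d+) sec")


def parse_line(line: str) -> LogLine:
    """Split one message-log line into its three pipe-separated fields."""
    timestamp, project, message = line.rstrip("\n").split("|", 2)
    return LogLine(timestamp, project, message)


def backoff_seconds(message: str) -> Optional[int]:
    """Return the backoff interval in seconds, or None if the message has none."""
    match = BACKOFF_RE.search(message)
    if match is None:
        return None
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


if __name__ == "__main__":
    sample = ("19/09/2008 8:01:00 p.m.|SETI@home|Backing off 3 hr 50 min 48 sec "
              "on upload of 02ap08ad.19170.23385.14.8.119_2_0")
    entry = parse_line(sample)
    print(entry.project, backoff_seconds(entry.message))  # SETI@home 13848
```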

EDIT: My other cruncher can upload/download WUs fine. I figured it could be the LAN connection (my laptop is using wireless, and my desktop is wired), but I tried both wireless and wired, and still no dice on the laptop.. *cries*
ID: 810211


 