Panic Mode On (13) Server problems

BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 868305 - Posted: 23 Feb 2009, 0:12:17 UTC - in response to Message 868157.  

For the past couple of months there has been an upload jam up to one degree or another every weekend. This particular weekend is a bit worse than 'normal'.

This is an established project -- a VERY established project. It has a very large user count to support and is running close to the edge in terms of the upload/download pipe it has to work with as well as the actual processing hardware -- that is a function of the very large and active user base being supported. There is also a certain amount of tweaking going on -- say, with the Astropulse client or the CUDA support. The thing is, with a very small margin of error to work with, if anything goes 'bump in the night' one sees a cascade of issues. Thus, the stated 2 to 4 hour Tuesday outage (which typically actually runs 4 to 6 hours) results in another 'catch up' challenge for the subsequent 4 to 12 hours. I suspect the uploads problem this weekend may result in 'funky times' until AFTER the Tuesday outage catch up period.
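As a rough illustration of why a small margin turns a short outage into a long catch-up, here is a back-of-the-envelope sketch. It assumes results keep arriving at a constant rate while the servers are down, and that afterwards the pipe drains the backlog with whatever capacity is left over. The function name and the `utilization` figures are made up for illustration; they are not measured values for SETI@home.

```python
def catch_up_hours(outage_hours: float, utilization: float) -> float:
    """Hours needed to drain the backlog built up during an outage.

    Assumes work arrives at a constant rate equal to `utilization` times
    the link capacity, so only (1 - utilization) of the link is spare.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    # Backlog = arrival rate * outage; drain rate = capacity - arrival rate.
    return outage_hours * utilization / (1 - utilization)


# Hypothetical utilization levels, purely to show the trend.
for rho in (0.5, 0.67, 0.9):
    hours = catch_up_hours(6, rho)
    print(f"utilization {rho:.0%}: 6 h outage -> {hours:.1f} h catch-up")
```

Under these assumptions the catch-up time grows without bound as utilization approaches 100%, which is one way to read the 'cascade' described above.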

As to the Tuesday maintenance outage -- other projects may be designed to run differently.

But the thing to note here is that this project is by far the most active of the BOINC projects, and so when something goes wrong, the system gets well and truly hammered.

Then again, there are other BOINC projects out there so end users can support other projects to 'cover' for the *fragility* of SETI.




Does this happen EVERY weekend or what?? I thought this was an established project? Most Alpha & beta projects are able to run without a weekly maintenance shutdown & a weekend server outage every weekend. So what gives??


ID: 868305 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 868322 - Posted: 23 Feb 2009, 0:54:38 UTC

Well, this is something I never thought I'd see:



Kudos to Matt and the boyz for assembling a set of servers that can keep the network fully saturated through the whole weekend.
ID: 868322 · Report as offensive
Profile SATAN
Joined: 27 Aug 06
Posts: 835
Credit: 2,129,006
RAC: 0
United Kingdom
Message 868324 - Posted: 23 Feb 2009, 0:59:22 UTC

Richard, you must have been filthy. Was it your monthly bath?
ID: 868324 · Report as offensive
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 868326 - Posted: 23 Feb 2009, 1:00:06 UTC - in response to Message 868322.  

Well, this is something I never thought I'd see:



Kudos to Matt and the boyz for assembling a set of servers that can keep the network fully saturated through the whole weekend.



Ha! beat me to it (minus the duck)

Bernie
ID: 868326 · Report as offensive
Profile PRESTON SNEAD

Joined: 28 Dec 00
Posts: 1
Credit: 299,529
RAC: 0
United States
Message 868330 - Posted: 23 Feb 2009, 1:23:14 UTC

Has anyone ever had this issue:

2/22/2009 6:12:07 PM|SETI@home|Started upload of 23ja09ab.25968.4980.6.8.156_0_0
2/22/2009 6:12:29 PM||Project communication failed: attempting access to reference site
2/22/2009 6:12:29 PM|SETI@home|Temporarily failed upload of 23ja09ab.25968.4980.6.8.156_0_0: connect() failed
2/22/2009 6:12:29 PM|SETI@home|Backing off 4 min 15 sec on upload of 23ja09ab.25968.4980.6.8.156_0_0
2/22/2009 6:12:30 PM||Internet access OK - project servers may be temporarily down.
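
For anyone wanting to tally these messages rather than read them line by line, here is a minimal sketch. It assumes only what the excerpt above shows: each line has the shape `timestamp|project|message`, with an empty project field for client-wide messages. The names `LogEntry`, `parse_line` and `summarize` are hypothetical helpers, not part of BOINC.

```python
from collections import Counter
from typing import NamedTuple


class LogEntry(NamedTuple):
    timestamp: str  # kept as text: day/month order depends on the locale
    project: str    # empty for client-wide messages
    message: str


def parse_line(line: str) -> LogEntry:
    """Split one 'timestamp|project|message' line into its three fields."""
    timestamp, project, message = line.strip().split("|", 2)
    return LogEntry(timestamp, project, message)


def summarize(lines):
    """Count upload starts, temporary failures and back-offs per project."""
    counts = Counter()
    for line in lines:
        if line.count("|") < 2:
            continue  # not a message-log line
        entry = parse_line(line)
        if entry.message.startswith("Started upload of"):
            counts[(entry.project, "started")] += 1
        elif entry.message.startswith("Temporarily failed upload of"):
            counts[(entry.project, "failed")] += 1
        elif entry.message.startswith("Backing off"):
            counts[(entry.project, "backoff")] += 1
    return counts


sample = [
    "2/22/2009 6:12:07 PM|SETI@home|Started upload of 23ja09ab.25968.4980.6.8.156_0_0",
    "2/22/2009 6:12:29 PM||Project communication failed: attempting access to reference site",
    "2/22/2009 6:12:29 PM|SETI@home|Temporarily failed upload of 23ja09ab.25968.4980.6.8.156_0_0: connect() failed",
    "2/22/2009 6:12:29 PM|SETI@home|Backing off 4 min 15 sec on upload of 23ja09ab.25968.4980.6.8.156_0_0",
    "2/22/2009 6:12:30 PM||Internet access OK - project servers may be temporarily down.",
]
summary = summarize(sample)
print(summary)
```

The timestamp is deliberately left unparsed, since logs from different machines may write the date as month/day or day/month.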


If so, is there a way to fix it so that I can get credit for it?
ID: 868330 · Report as offensive
Profile Jack Shaftoe
Joined: 19 Aug 04
Posts: 44
Credit: 2,343,242
RAC: 0
United States
Message 868335 - Posted: 23 Feb 2009, 1:33:23 UTC - in response to Message 868330.  
Last modified: 23 Feb 2009, 1:37:20 UTC

Nope, never, no idea what you are talking about.

/sarcasm...
ID: 868335 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 868337 - Posted: 23 Feb 2009, 1:38:05 UTC - in response to Message 868335.  

Nope, never, no idea what you are talking about.

Give him his leg back.

Everyone is having this problem at the moment.


BOINC WIKI
ID: 868337 · Report as offensive
Keith Wortham

Joined: 10 Jan 06
Posts: 5
Credit: 5,992,626
RAC: 0
United States
Message 868340 - Posted: 23 Feb 2009, 1:57:04 UTC

Is there any danger of this lasting so that I cannot upload my work units by the 10th of March?
ID: 868340 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 868343 - Posted: 23 Feb 2009, 1:59:18 UTC - in response to Message 868340.  

Is there any danger of this lasting so that I cannot upload my work units by the 10th of March?



No, it should be cleared by Wednesday at the latest. Then we can do it all over again next weekend. :)


PROUD MEMBER OF Team Starfire World BOINC
ID: 868343 · Report as offensive
Profile arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 868352 - Posted: 23 Feb 2009, 2:27:51 UTC - in response to Message 868343.  

Is there any danger of this lasting so that I cannot upload my work units by the 10th of March?



No, it should be cleared by Wednesday at the latest. Then we can do it all over again next weekend. :)


Even if it does last that long, you would still get credit for them because nobody else could return them until then.

ID: 868352 · Report as offensive
Keith Wortham

Joined: 10 Jan 06
Posts: 5
Credit: 5,992,626
RAC: 0
United States
Message 868359 - Posted: 23 Feb 2009, 2:43:13 UTC - in response to Message 868343.  

Is there any danger of this lasting so that I cannot upload my work units by the 10th of March?



No, it should be cleared by Wednesday at the latest. Then we can do it all over again next weekend. :)


I suppose that I should go back to my 15-year-old Pentium processor so that we don't ask too much of their servers? I apologize. I thought that the purpose of this was to construct a large virtual supercomputer.

Oh well, I will snail mail my results in. What is the US Postal address? The results will come in by media mail.
ID: 868359 · Report as offensive
Profile Raptor

Joined: 23 Jun 02
Posts: 7
Credit: 1,385,878
RAC: 0
Canada
Message 868372 - Posted: 23 Feb 2009, 3:17:45 UTC
Last modified: 23 Feb 2009, 3:50:21 UTC

For some reason, my BOINC isn't recognizing my Internet connection. I've updated my antivirus and spyware as well as visited webpages. Even as I'm typing this I'm hitting the "Retry Now" button and it gives me that BOINC is unable to communicate with the Internet, please check your...

I've never had to touch my net settings before in BOINC, it's just started now....but what I don't get is that it did communicate with SETI@home just before this. Here's the message:

22/02/2009 9:44:55 PM|SETI@home|Computation for task 12ja09ad.18522.481.6.8.115_0 finished
22/02/2009 9:44:57 PM|World Community Grid|Restarting task E000433_762A_002g0500y_0 using cep1 version 628
22/02/2009 9:44:58 PM|SETI@home|Started upload of 12ja09ad.18522.481.6.8.115_0_0
22/02/2009 9:45:20 PM||Project communication failed: attempting access to reference site
22/02/2009 9:45:20 PM|SETI@home|Temporarily failed upload of 12ja09ad.18522.481.6.8.115_0_0: HTTP error
22/02/2009 9:45:20 PM|SETI@home|Backing off 1 min 0 sec on upload of 12ja09ad.18522.481.6.8.115_0_0
22/02/2009 9:45:21 PM||Internet access OK - project servers may be temporarily down.
22/02/2009 9:46:20 PM|SETI@home|Started upload of 12ja09ad.18522.481.6.8.115_0_0
22/02/2009 9:46:42 PM||Project communication failed: attempting access to reference site
22/02/2009 9:46:42 PM|SETI@home|Temporarily failed upload of 12ja09ad.18522.481.6.8.115_0_0: HTTP error
22/02/2009 9:46:42 PM|SETI@home|Backing off 1 min 0 sec on upload of 12ja09ad.18522.481.6.8.115_0_0
22/02/2009 9:46:43 PM||Internet access OK - project servers may be temporarily down.
22/02/2009 9:47:42 PM|SETI@home|Started upload of 12ja09ad.18522.481.6.8.115_0_0
22/02/2009 9:48:04 PM||Project communication failed: attempting access to reference site
22/02/2009 9:48:04 PM|SETI@home|Temporarily failed upload of 12ja09ad.18522.481.6.8.115_0_0: HTTP error
22/02/2009 9:48:04 PM|SETI@home|Backing off 1 min 0 sec on upload of 12ja09ad.18522.481.6.8.115_0_0
22/02/2009 9:48:05 PM||Internet access OK - project servers may be temporarily down.
22/02/2009 9:49:04 PM|SETI@home|Started upload of 12ja09ad.18522.481.6.8.115_0_0
22/02/2009 9:49:26 PM||Project communication failed: attempting access to reference site
22/02/2009 9:49:26 PM|SETI@home|Temporarily failed upload of 12ja09ad.18522.481.6.8.115_0_0: HTTP error
22/02/2009 9:49:26 PM|SETI@home|Backing off 1 min 0 sec on upload of 12ja09ad.18522.481.6.8.115_0_0
22/02/2009 9:49:48 PM||BOINC can't access Internet - check network connection or proxy configuration.
22/02/2009 9:50:26 PM|SETI@home|Started upload of 12ja09ad.18522.481.6.8.115_0_0
22/02/2009 9:50:48 PM|SETI@home|Temporarily failed upload of 12ja09ad.18522.481.6.8.115_0_0: HTTP error
22/02/2009 9:50:48 PM|SETI@home|Backing off 1 min 0 sec on upload of 12ja09ad.18522.481.6.8.115_0_0
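
The 'Backing off' lines in the log are the client waiting before it retries a failed upload. Below is a generic sketch of randomized exponential backoff, the usual way to stop many clients from retrying in lockstep against an overloaded server. It is NOT BOINC's actual algorithm; the function name, the 60-second base and the 4-hour cap are all assumptions chosen for illustration.

```python
import random


def backoff_delays(attempts, base=60.0, cap=4 * 3600.0, rng=None):
    """Yield one randomized delay in seconds per failed attempt.

    The upper bound doubles after each failure, up to `cap`; the delay
    is drawn uniformly between half the bound and the bound itself.
    """
    rng = rng or random.Random()
    for attempt in range(attempts):
        bound = min(cap, base * 2 ** attempt)
        yield rng.uniform(bound / 2, bound)


# Seeded generator so repeated runs print the same hypothetical schedule.
delays = list(backoff_delays(6, rng=random.Random(0)))
for n, delay in enumerate(delays, start=1):
    print(f"failure {n}: wait about {delay:.0f} s")
```

The random spread matters as much as the doubling: without it, every client that failed at the same moment would come back at the same moment.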

Any ideas what I can do?

Quick edit....it went back to saying the SETI servers could be down...just checked and aside from a validator (not running) and a few splitters (disabled) everything is green.
ID: 868372 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 868381 - Posted: 23 Feb 2009, 3:52:49 UTC - in response to Message 868372.  

You and everyone else. There is a plethora of threads on this subject at the moment.


BOINC WIKI
ID: 868381 · Report as offensive
Profile Raptor

Joined: 23 Jun 02
Posts: 7
Credit: 1,385,878
RAC: 0
Canada
Message 868383 - Posted: 23 Feb 2009, 4:00:09 UTC - in response to Message 868381.  

Oh sorry, forgot to use the Search bar....thanks for answering though.
ID: 868383 · Report as offensive
Stoffe

Joined: 24 Mar 00
Posts: 11
Credit: 559,914
RAC: 0
Sweden
Message 868405 - Posted: 23 Feb 2009, 5:14:08 UTC

Now what exactly is it that takes time now? Shouldn't those responsible be able to fix the problem or at least come up with a temporary solution in all this time? Wait until Wednesday... *roll eyes*
ID: 868405 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 868409 - Posted: 23 Feb 2009, 5:21:35 UTC - in response to Message 868405.  

Now what exactly is it that takes time now? Shouldn't those responsible be able to fix the problem or at least come up with a temporary solution in all this time? Wait until Wednesday... *roll eyes*

It is the weekend. The staff is 3 people. They are NOT on call 24/7. If it is easy, they can dial in from home for a quick fix. If it is hard, it usually looks like:

Monday: Diagnose problem.
Tuesday: Fix problem, normal Tuesday outage is extended to cover added time for fixes.
Wednesday: We are all recovering.


BOINC WIKI
ID: 868409 · Report as offensive
Profile Jack Zhang
Volunteer tester
Joined: 2 Jul 06
Posts: 206
Credit: 6,142,449
RAC: 0
Canada
Message 868411 - Posted: 23 Feb 2009, 5:24:20 UTC - in response to Message 868405.  

All I'm doing is using my GPU to do folding until this is finally settled.

Seriously, I think it's time for a true gigabit pipe for the servers. That 100 Mbps box has got to go and be replaced by fiber.


What if Fiction was Fact and Fact was Fiction and vice versa?
ID: 868411 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 868414 - Posted: 23 Feb 2009, 5:26:41 UTC - in response to Message 868411.  

All I'm doing is using my GPU to do folding until this is finally settled.

Seriously, I think it's time for a true gigabit pipe for the servers. That 100 Mbps box has got to go and be replaced by fiber.


Do you have the extra $100,000 in your pocket? The problem is a mile or so run up a hill...


BOINC WIKI
ID: 868414 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 868420 - Posted: 23 Feb 2009, 5:34:02 UTC - in response to Message 868414.  

All I'm doing is using my GPU to do folding until this is finally settled.

Seriously, I think it's time for a true gigabit pipe for the servers. That 100 Mbps box has got to go and be replaced by fiber.


Do you have the extra $100,000 in your pocket? The problem is a mile or so run up a hill...

What, no lottery millionaires to help with the last mile?
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 868420 · Report as offensive
Hanford WA4LZC
Joined: 15 May 99
Posts: 38
Credit: 10,129,207
RAC: 0
United States
Message 868423 - Posted: 23 Feb 2009, 5:44:04 UTC - in response to Message 868420.  

What, no lottery millionaires to help with the last mile?


I was 2 numbers short of doing just that...... ;-(
ID: 868423 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.