Panic Mode On (16) Server problems

cliff west
Joined: 7 May 01
Posts: 211
Credit: 16,180,728
RAC: 15
United States
Message 903849 - Posted: 5 Jun 2009, 3:31:46 UTC

Well, I just updated the software (.31), installed new hardware (GTX 260) for CUDA, and upped my cache to 4 days. So far:

6/4/2009 8:29:08 PM SETI@home Reporting 1 completed tasks, requesting new tasks
6/4/2009 8:29:13 PM SETI@home Scheduler request completed: got 0 new tasks
6/4/2009 8:29:13 PM SETI@home Message from server: No work sent
6/4/2009 8:29:28 PM SETI@home Sending scheduler request: To fetch work.
6/4/2009 8:29:28 PM SETI@home Requesting new tasks
6/4/2009 8:29:33 PM SETI@home Scheduler request completed: got 0 new tasks
6/4/2009 8:29:33 PM SETI@home Message from server: (Project has no jobs available)
6/4/2009 8:30:48 PM SETI@home Sending scheduler request: To fetch work.
6/4/2009 8:30:48 PM SETI@home Requesting new tasks
6/4/2009 8:30:53 PM SETI@home Scheduler request completed: got 0 new tasks
6/4/2009 8:30:53 PM SETI@home Message from server: (Project has no jobs available)

ID: 903849
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 903857 - Posted: 5 Jun 2009, 4:01:41 UTC - in response to Message 903849.  


Network traffic has slumped again, and getting plenty of
5/06/2009 13:07:42 SETI@home Message from server: (Project has no jobs available)
messages and no work.

Grant
Darwin NT
ID: 903857
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 903860 - Posted: 5 Jun 2009, 4:28:52 UTC - in response to Message 903857.  
Last modified: 5 Jun 2009, 4:33:43 UTC


Network traffic has slumped again, and getting plenty of
5/06/2009 13:07:42 SETI@home Message from server: (Project has no jobs available)
messages and no work.


Finally got some work after all the "no jobs available" messages.
Now getting
5/06/2009 13:54:28 SETI@home Scheduler request failed: Couldn't connect to server


EDIT- now getting work again & no Scheduler connection failure messages. Just checked the network traffic again. Downloads were still dead but inbound traffic had surged (hence the problems contacting the scheduler).
Outbound traffic has now picked up again, inbound reducing. Looks like things have settled down again.
Grant
Darwin NT
ID: 903860
Chelski
Joined: 3 Jan 00
Posts: 121
Credit: 8,979,050
RAC: 0
Malaysia
Message 903924 - Posted: 5 Jun 2009, 11:26:08 UTC

It has been very clear over the last couple of weeks that when AP work becomes available, the bandwidth gets almost maxed out. And when there is no more AP work left, it drops almost immediately.

The end effect is that the AP clients are getting starved. When the work is available, most of the time my crunchers can't even upload finished work, or can't get through to ask for work. And when the connection is restored, the AP WUs are gone.

Seems like there are too many "AP only" crunchers who are not accepting MBs? If the project thinks that we need to temporarily convert all clients to MB to clear the backlog, please tell us so.
ID: 903924
MonorailPilot

Joined: 12 May 99
Posts: 8
Credit: 4,422,543
RAC: 0
United States
Message 903952 - Posted: 5 Jun 2009, 13:50:13 UTC - in response to Message 901411.  


I will agree with AP being an opt-in sort of deal, but after looking at one of the causes of all of my pending credits/tasks.. AP should not be allowed on hosts that have less than a 250 RAC. That requirement makes sure they're going to actually try to do some crunching. 90% of the wingmen I get have less than 5 tasks, are brand new rigs, or have RACs of less than 10..because of one/both of the previous two reasons.


Which will eliminate at least 1 of my rigs crunching AP permanently.

Now, if you want to tell me that someone needs to have returned at least 10 MBs or something, I don't have an issue with that. But there are plenty of people out there with rigs in the 100-200 RAC range that are MORE than capable of crunching AP.
ID: 903952
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 903957 - Posted: 5 Jun 2009, 14:09:16 UTC - in response to Message 903952.  

Now, if you want to tell me that someone needs to have returned at least 10 MBs or something, I don't have an issue with that. But there are plenty of people out there with rigs in the 100-200 RAC range that are MORE than capable of crunching AP.



Even with a 10 Multibeam limit, there is going to be a problem with new people getting hit with APs for the first time. With the estimated time to completion inflated so badly at first, the new cruncher takes one look at that, figures he will never be able to complete them, deletes BOINC, and is never heard from again, leaving whatever work was on the machine to time out.

It seems to me about the best way to handle this is to make Astropulse work strictly opt-in, so that you actually have to come into the forums and find out what it is all about before you get any. This way, any of the set and forget crowd will be happily crunching away on MBs and won't even know what they are missing. Those that actually look at their results will eventually start to wonder why some of their wingmen are getting so much more credit and come into the forums to find out.


PROUD MEMBER OF Team Starfire World BOINC
ID: 903957
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 904016 - Posted: 5 Jun 2009, 17:21:34 UTC - in response to Message 903952.  

...there are plenty of people out there with rigs in the 100-200 RAC range that are MORE than capable of crunching AP.

And there are hosts out there with S@H RAC of 3 or less which are more than capable of crunching AP; low RAC may simply mean the host is actively working for many projects. If such a host gets an AP WU, it finishes within deadline, but it will build LTD and not ask for work here for quite some time.
Joe
ID: 904016
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 904481 - Posted: 6 Jun 2009, 21:20:13 UTC


Looks like there have been a few problems overnight; the network traffic graphs look like a sawtooth.
Grant
Darwin NT
ID: 904481
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 904688 - Posted: 7 Jun 2009, 8:05:37 UTC

As always.. cause for panic when the result creation rate is NULL/sec.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 904688
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 904692 - Posted: 7 Jun 2009, 8:28:42 UTC
Last modified: 7 Jun 2009, 8:35:36 UTC






Yes.. the graph is looking strange..


[no static, up-to-date pic]


ID: 904692
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 904747 - Posted: 7 Jun 2009, 11:43:41 UTC - in response to Message 904692.  






Yes.. the graph is looking strange..


[no static, up-to-date pic]


Looks like a Kittyman mood graph.......higher than a kite, then crash....LOL.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 904747
Elphidieus

Joined: 1 Nov 02
Posts: 67
Credit: 3,140,607
RAC: 0
Malaysia
Message 904789 - Posted: 7 Jun 2009, 14:07:27 UTC

Yay...!!! I Can't Upload WUs Again...!!! :x
ID: 904789
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 66199
Credit: 55,293,173
RAC: 49
United States
Message 904790 - Posted: 7 Jun 2009, 14:11:16 UTC - in response to Message 904747.  
Last modified: 7 Jun 2009, 14:12:20 UTC



Yes.. the graph is looking strange..


[no static, up-to-date pic]


Looks like a Kittyman mood graph.......higher than a kite, then crash....LOL.

Ok, Somebody left Ben's Kite up there. Uploads are either slow or stalled depending on traffic.
Savoir-Faire is everywhere!
The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST

ID: 904790
Chelski
Joined: 3 Jan 00
Posts: 121
Credit: 8,979,050
RAC: 0
Malaysia
Message 904795 - Posted: 7 Jun 2009, 14:24:18 UTC

No problem yet on my side uploading and reporting tasks. But looking at the number of APs in the field, it will soon be time to switch a few crunchers to MB.
ID: 904795
Labbie
Joined: 19 Jun 06
Posts: 4083
Credit: 5,930,102
RAC: 0
United States
Message 904798 - Posted: 7 Jun 2009, 14:31:47 UTC

The upload issue just cleared itself here.


Calm Chaos Forum...Join Calm Chaos Now
ID: 904798
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 66199
Credit: 55,293,173
RAC: 49
United States
Message 904905 - Posted: 7 Jun 2009, 18:54:41 UTC - in response to Message 904798.  

The upload issue just cleared itself here.

It's down again, or a horde is trying to upload.

06/07/09 11:49:09	SETI@home	Started upload of 20mr09ac.8447.7434.16.8.28_0_0
06/07/09 11:49:31			Project communication failed: attempting access to reference site
06/07/09 11:49:31	SETI@home	Temporarily failed upload of 20mr09ac.8447.7434.16.8.28_0_0: HTTP error
06/07/09 11:49:31	SETI@home	Backing off 1 min 32 sec on upload of 20mr09ac.8447.7434.16.8.28_0_0
06/07/09 11:49:31	SETI@home	Started upload of 20mr09ac.8447.7434.16.8.24_0_0
06/07/09 11:49:33			Internet access OK - project servers may be temporarily down.
06/07/09 11:49:58			Project communication failed: attempting access to reference site
06/07/09 11:49:58	SETI@home	Temporarily failed upload of 20mr09ab.23962.306381.7.8.188_1_0: HTTP error
06/07/09 11:49:58	SETI@home	Backing off 11 min 26 sec on upload of 20mr09ab.23962.306381.7.8.188_1_0
06/07/09 11:49:58	SETI@home	Started upload of 20mr09ac.4122.19704.3.8.24_0_0
06/07/09 11:49:59			Internet access OK - project servers may be temporarily down.

Savoir-Faire is everywhere!
The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST

ID: 904905
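
Editor's note: the "Backing off" lines in the log above show the client waiting different intervals (1 min 32 sec, 11 min 26 sec) before retrying failed uploads. That pattern is consistent with a randomized backoff that grows with repeated failures. The following is a minimal sketch of generic randomized exponential backoff, not the BOINC client's actual code; the function name `backoff_delay` and the `base` and `cap` values are assumptions chosen for illustration.

```python
import random


def backoff_delay(attempt, base=60.0, cap=4 * 3600.0, rng=random):
    """Return a randomized retry delay in seconds for a 1-based attempt number.

    Hypothetical illustration only: base and cap are made-up values,
    not settings taken from the BOINC client.
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # The ceiling doubles with each failed attempt until it reaches the cap.
    ceiling = min(cap, base * 2 ** (attempt - 1))
    # Randomize between half the ceiling and the ceiling so that many
    # clients that failed at the same moment do not all retry together.
    return rng.uniform(ceiling / 2, ceiling)


if __name__ == "__main__":
    for attempt in range(1, 6):
        print(attempt, round(backoff_delay(attempt)), "seconds")
```

The randomization matters in a situation like the one in this thread: if every client retried on the same fixed schedule after an outage, the retries would arrive as a synchronized surge.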
Rob.B

Joined: 23 Jul 99
Posts: 157
Credit: 1,439,682
RAC: 0
United Kingdom
Message 904910 - Posted: 7 Jun 2009, 19:04:59 UTC
Last modified: 7 Jun 2009, 19:05:28 UTC

No problems uploading completed work units; I just can't seem to report them.


07/06/2009 19:56:59 SETI@home Reporting 12 completed tasks, not requesting new tasks
07/06/2009 19:57:21 Project communication failed: attempting access to reference site
07/06/2009 19:57:23 Internet access OK - project servers may be temporarily down.
07/06/2009 19:57:24 SETI@home Scheduler request failed: Failure when receiving data from the peer
ID: 904910
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 904912 - Posted: 7 Jun 2009, 19:09:56 UTC
Last modified: 7 Jun 2009, 19:26:33 UTC

Trying to get some CUDA or anything.
Database says only 192 ready to send, so someone's going to be unlucky. lol
2 APs there though.


EDIT: Panic over, the replica caught up; there are 47k plus MBs.
ID: 904912
Rob.B

Joined: 23 Jul 99
Posts: 157
Credit: 1,439,682
RAC: 0
United Kingdom
Message 904917 - Posted: 7 Jun 2009, 19:14:57 UTC

I have a three day cache currently, so I've stopped requesting work. I think the network is having a bit of a funny turn at the moment (looking at the cricket graph), so I'll not add to its woes.
ID: 904917
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 904922 - Posted: 7 Jun 2009, 19:19:33 UTC - in response to Message 904917.  

I ran my cache down so I could rebuild my machine in
a new case and clean out the dust bunnies.
I am trying to get some work.
I have about 1 day or so now and need enough to cover the outage,
then I will stop requesting too.
ID: 904922
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.