Panic Mode On (80) Server Problems?

Message boards : Number crunching : Panic Mode On (80) Server Problems?

Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1324000 - Posted: 3 Jan 2013, 4:40:47 UTC
Last modified: 3 Jan 2013, 4:42:24 UTC

I've gotten 26 so far.

Back-of-the-envelope calculations suggest that at most about 25 units per second could be downloaded, limited by the pipe's bandwidth. This guesstimate doesn't include any outgoing bandwidth used for control and status requests.

Now, since a significant number of us are using GPUs to help do the heavy lifting, a two-day outage would have wiped out most if not all of our 100-unit queue, so everybody wants 100 GPU units. That comes out to filling only 900 hosts per hour, best case (25 units/s x 3,600 s = 90,000 units per hour, at 100 units per host). So how many active hosts are out there crunching with one or more GPUs? 10,000? 20,000? So it's going to take a dozen hours or two to handle demand, and that doesn't include the fact that the fastest among us could crunch 100 GPU units in under an hour and a good number of us in under a day. It'll take a bit.
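For anyone who wants to replay that estimate with different guesses, here is a minimal sketch. The 25 units per second, the 100-unit queue and the 10,000 and 20,000 host counts are the guesses from the paragraph above; the function names are made up for illustration, and the model ignores hosts that come back for a second refill.

```python
# Back-of-the-envelope refill estimate using the figures guessed above.
UNITS_PER_SECOND = 25   # guessed download ceiling for the pipe
QUEUE_SIZE = 100        # GPU units each host wants after the outage


def hosts_filled_per_hour(units_per_second: float, queue_size: int) -> float:
    """Hosts whose whole queue can be refilled in one hour."""
    return units_per_second * 3600 / queue_size


def hours_to_refill(host_count: int, units_per_second: float, queue_size: int) -> float:
    """Hours needed to give every host one full queue, ignoring repeat requests."""
    return host_count / hosts_filled_per_hour(units_per_second, queue_size)


if __name__ == "__main__":
    rate = hosts_filled_per_hour(UNITS_PER_SECOND, QUEUE_SIZE)
    print(f"{rate:.0f} hosts per hour")
    for hosts in (10_000, 20_000):  # host counts are guesses, not measurements
        hours = hours_to_refill(hosts, UNITS_PER_SECOND, QUEUE_SIZE)
        print(f"{hosts} hosts: about {hours:.1f} hours")
```

Halving the download rate doubles the refill time, so the estimate is only as good as the 25 units per second guess.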

And since I've started writing this, I got 41 more.
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1324000
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13994
Credit: 208,696,464
RAC: 304
Australia
Message 1324007 - Posted: 3 Jan 2013, 5:18:24 UTC - in response to Message 1324000.  

So it's going to take a dozen hours or two to handle demand, and that doesn't include the fact that the fastest among us could crunch 100 GPU units in under an hour and a good number of us in under a day. It'll take a bit.

My GTX 560Ti does roughly 200 WUs per day if they all take 22 min to crunch. Some are longer, most are shorter.
On average it probably does closer to 300 WUs per day, when it can get them.
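A quick sanity check of those figures, as a rough sketch: one task slot running 22-minute tasks back to back finishes only about 65 per day, so 200 per day works out to roughly three tasks on the card at once. That concurrency figure is inferred from the arithmetic and is not stated in the post; the function names are made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def implied_concurrency(tasks_per_day: float, minutes_per_task: float) -> float:
    """Tasks that must run side by side to reach the daily count."""
    return tasks_per_day * minutes_per_task / MINUTES_PER_DAY


def hours_to_drain(queue_size: int, tasks_per_day: float) -> float:
    """Hours a full queue lasts at a given daily throughput."""
    return 24 * queue_size / tasks_per_day


if __name__ == "__main__":
    print(f"implied concurrency: {implied_concurrency(200, 22):.2f}")
    for per_day in (200, 300):  # the two throughput figures quoted above
        print(f"{per_day} WUs/day: 100-unit queue lasts {hours_to_drain(100, per_day):.0f} hours")
```

At either rate a 100-unit queue is gone in well under a day.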

Grant
Darwin NT
ID: 1324007
Wiggo
Joined: 24 Jan 00
Posts: 38497
Credit: 261,360,520
RAC: 489
Australia
Message 1324036 - Posted: 3 Jan 2013, 7:53:19 UTC - in response to Message 1324007.  

It seems that the feeder is having a bit of a problem feeding us.

Cheers.
ID: 1324036
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9960
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1324041 - Posted: 3 Jan 2013, 8:14:06 UTC

Still just getting VLARS for the CPU. Never seen this many at once. Nothing for GPU.
ID: 1324041
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1324051 - Posted: 3 Jan 2013, 9:35:52 UTC - in response to Message 1324041.  

Well so far I've gotten all of 67 MB units. 56 GPU, 11 CPU, none VLAR.

I'm now at 99 GPU and 46 CPU.

Like I said before, best super ideal guesstimate is that maybe 900 hosts can be fed 100 units per hour. At the current cricket levels I'll guess maybe 500 hosts per hour.

So how many hosts want GPU units? How many do an Oliver Twist and ask for another full reload in 3/6/12/24 hours?
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1324051
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51580
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1324052 - Posted: 3 Jan 2013, 9:42:36 UTC - in response to Message 1324041.  

Still just getting VLARS for the CPU. Never seen this many at once. Nothing for GPU.

I set 'use nVidia GPU' to no for the night, will try to build some CPU cache.
Otherwise the rigs were just asking for GPU and getting NOTHING.
I'll enable GPU in the morning and see if anything's changed.
Thanks for the heads up.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1324052
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13994
Credit: 208,696,464
RAC: 304
Australia
Message 1324054 - Posted: 3 Jan 2013, 9:53:42 UTC - in response to Message 1324052.  


I've been getting a mix of GPU & CPU work. Naturally, the machine with the faster GPU is struggling to get GPU work, and the one with the faster CPU is struggling to get CPU work.
Probably getting work once every 20-30 requests.

Certainly seems like the feeder is the culprit: plenty of work there, just none available.
I suspect they might have tweaked a few things after the last outage. Scheduler responses are coming within 3-5 seconds; before the outage, if a request didn't time out, it took a minute or three.
Grant
Darwin NT
ID: 1324054
.clair.
Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 69
United Kingdom
Message 1324062 - Posted: 3 Jan 2013, 11:48:40 UTC

My retry button is going to sue me for abuse, though I finally got
03/01/2013 11:36:58 | SETI@home | Scheduler request completed: got 87 new tasks
That will not last the day out, but at least it's something.
ID: 1324062
fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1324070 - Posted: 3 Jan 2013, 12:02:48 UTC

For the last two days I have had two PCs that are unable to report completed tasks.

Rampage

364 SETI@home 1/3/2013 5:59:04 AM Scheduler request failed: Couldn't connect to server

Anyone else getting this, or know what might be causing it?

Frank
ID: 1324070
fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1324109 - Posted: 3 Jan 2013, 14:07:38 UTC - in response to Message 1324070.  

Solved..... turned off the proxy and it resumed working.
ID: 1324109
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1324252 - Posted: 3 Jan 2013, 17:51:37 UTC
Last modified: 3 Jan 2013, 17:56:47 UTC

Something's definitely going on. When the Servers came back up yesterday I had 24 AstroPulses remaining. I've only received a handful since yesterday. I'm down to my last 3 at present. You can look around and see I'm not alone. Most of the computers I saw were out or only had a couple, one had 14. All I get is "Project has no tasks available" when asking for APs, even though the server page lists 25,000 ready to send.

The last one finished in 12 minutes, now I have 2...
ID: 1324252
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51580
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1324253 - Posted: 3 Jan 2013, 17:54:44 UTC - in response to Message 1324252.  

Something's definitely going on. When the Servers came back up yesterday I had 24 AstroPulses remaining. I've only received a handful since yesterday. I'm down to my last 3 at present. You can look around and see I'm not alone. Most of the computers I saw were out or only had a couple, one had 14. All I get is "Project has no tasks available" when asking for APs, even though the server page lists 25,000 ready to send.

Yeah, either something is not working quite right, or they deliberately changed some settings.
Even though there is beaucoup work ready to send, the bandwidth is not being fully utilized. It would appear to me that perhaps the scheduler/feeder combo might not be up to speed.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1324253
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1324255 - Posted: 3 Jan 2013, 18:04:50 UTC - in response to Message 1324253.  

Something's definitely going on. When the Servers came back up yesterday I had 24 AstroPulses remaining. I've only received a handful since yesterday. I'm down to my last 3 at present. You can look around and see I'm not alone. Most of the computers I saw were out or only had a couple, one had 14. All I get is "Project has no tasks available" when asking for APs, even though the server page lists 25,000 ready to send.

Yeah, either something is not working quite right, or they deliberately changed some settings.
Even though there is beaucoup work ready to send, the bandwidth is not being fully utilized. It would appear to me that perhaps the scheduler/feeder combo might not be up to speed.

I just received 2 more. If those go as fast as the last dozen or so, I'm good for about an hour. When looking around I also noticed quite a few people were below the 200 task limit. I'm well below 200 because I'm saving space for those elusive APs.
ID: 1324255
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34648
Credit: 79,922,639
RAC: 80
Germany
Message 1324256 - Posted: 3 Jan 2013, 18:13:16 UTC

I just got 100 units in less than 30 minutes.
Download speed was ~150 KB/sec.
But also every second retry gave me no tasks available.

With each crime and every kindness we birth our future.
ID: 1324256
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1324272 - Posted: 3 Jan 2013, 18:42:59 UTC - in response to Message 1324256.  
Last modified: 3 Jan 2013, 18:47:20 UTC

I just got 100 units in less than 30 minutes.
Download speed was ~150 KB/sec.
But also every second retry gave me no tasks available.

I see 3 completed, a long time ago. This is you right?
All AstroPulse v6 tasks for computer 5735690

The other one shows 3 in progress, I have 3 waiting.
All AstroPulse v6 tasks for computer 359
ID: 1324272
mikeej42
Joined: 26 Oct 00
Posts: 109
Credit: 791,875,385
RAC: 9
United States
Message 1324286 - Posted: 3 Jan 2013, 19:12:54 UTC - in response to Message 1324256.  

I just got 100 units in less than 30 minutes.
Download speed was ~150 KB/sec.
But also every second retry gave me no tasks available.

If you ever get some assigned they download very fast here, even APs.

I have 20 machines over 90, most are between 20-75 tasks, and still have 10 machines at 0.
ID: 1324286
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9960
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1324290 - Posted: 3 Jan 2013, 19:18:40 UTC

Yes, the downloads are some of the fastest I have seen, peaking at 350k here.

Still have not received any GPU tasks, plenty of CPU!
ID: 1324290
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1324312 - Posted: 3 Jan 2013, 19:44:27 UTC - in response to Message 1324253.  

Something's definitely going on. When the Servers came back up yesterday I had 24 AstroPulses remaining. I've only received a handful since yesterday. I'm down to my last 3 at present. You can look around and see I'm not alone. Most of the computers I saw were out or only had a couple, one had 14. All I get is "Project has no tasks available" when asking for APs, even though the server page lists 25,000 ready to send.

Yeah, either something is not working quite right, or they deliberately changed some settings.
Even though there is beaucoup work ready to send, the bandwidth is not being fully utilized. It would appear to me that perhaps the scheduler/feeder combo might not be up to speed.

Perhaps they turned it down a bit to relieve some stress on the backend network. I think Matt had mentioned there might be some issues going on there in his last news post.

SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1324312
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1324361 - Posted: 3 Jan 2013, 20:30:34 UTC

I'm on my last AstroPulse task, and all I get is:
1/3/2013 3:19:13 PM | SETI@home | Sending scheduler request: To fetch work.
1/3/2013 3:19:13 PM | SETI@home | Reporting 1 completed tasks, requesting new tasks for ATI
1/3/2013 3:19:16 PM | SETI@home | Scheduler request completed: got 0 new tasks
1/3/2013 3:19:16 PM | SETI@home | Project has no tasks available
1/3/2013 3:22:57 PM | SETI@home | Computation for task ap_10oc12aa_B1_P0_00386_20130101_03214.wu_1 finished
1/3/2013 3:22:57 PM | SETI@home | Starting task ap_09oc12ad_B4_P0_00164_20121231_10201.wu_0 using astropulse_v6 version 604 (ati_opencl_100) in slot 1
1/3/2013 3:22:59 PM | SETI@home | Started upload of ap_10oc12aa_B1_P0_00386_20130101_03214.wu_1_0
1/3/2013 3:23:02 PM | SETI@home | Finished upload of ap_10oc12aa_B1_P0_00386_20130101_03214.wu_1_0
1/3/2013 3:24:21 PM | SETI@home | Sending scheduler request: To fetch work.
1/3/2013 3:24:21 PM | SETI@home | Reporting 1 completed tasks, requesting new tasks for ATI
1/3/2013 3:24:24 PM | SETI@home | Scheduler request completed: got 0 new tasks
1/3/2013 3:24:24 PM | SETI@home | Project has no tasks available
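If you want to tally how often requests like these come back empty, a small script can count them from a saved copy of the event log. This is only a sketch: the regular expression assumes the "got N new tasks" wording seen in the lines pasted in this thread, `summarize_requests` is a made-up helper rather than anything BOINC provides, and the sample lines are copied from posts above.

```python
import re

# Matches the "got N new task(s)" wording seen in the pasted event-log lines.
GOT_TASKS = re.compile(r"Scheduler request completed: got (\d+) new task")


def summarize_requests(log_lines):
    """Return (completed_requests, requests_with_work, total_tasks)."""
    requests = with_work = total = 0
    for line in log_lines:
        match = GOT_TASKS.search(line)
        if match is None:
            continue  # not a scheduler-reply line
        count = int(match.group(1))
        requests += 1
        total += count
        if count > 0:
            with_work += 1
    return requests, with_work, total


# Sample lines copied from posts in this thread.
SAMPLE = [
    "1/3/2013 3:19:13 PM | SETI@home | Sending scheduler request: To fetch work.",
    "1/3/2013 3:19:16 PM | SETI@home | Scheduler request completed: got 0 new tasks",
    "1/3/2013 3:19:16 PM | SETI@home | Project has no tasks available",
    "1/3/2013 3:24:24 PM | SETI@home | Scheduler request completed: got 0 new tasks",
    "03/01/2013 11:36:58 | SETI@home | Scheduler request completed: got 87 new tasks",
]

if __name__ == "__main__":
    completed, with_work, total = summarize_requests(SAMPLE)
    print(f"{with_work} of {completed} requests returned work ({total} tasks)")
```

Matching only on the message text keeps the script independent of the timestamp format, which differs between the logs posted here.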


Meanwhile, Results ready to send: 25,240

Is it time to go back to MultiBeams on the AMD?
ID: 1324361
Rolf
Joined: 16 Jun 09
Posts: 114
Credit: 7,817,146
RAC: 0
Switzerland
Message 1324372 - Posted: 3 Jan 2013, 21:19:40 UTC - in response to Message 1324361.  

Is it time to go back to MultiBeams on the AMD?

Not necessarily. You will get the same answer:
03.01.2013 21:33:58 | SETI@home | Requesting new tasks for CPU
03.01.2013 21:34:01 | SETI@home | Scheduler request completed: got 0 new tasks
03.01.2013 21:34:01 | SETI@home | Project has no tasks available

N.B. times are UTC+1
ID: 1324372
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.