Panic Mode On (76) Server Problems?

Message boards : Number crunching : Panic Mode On (76) Server Problems?

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 7490
Credit: 91,162,254
RAC: 46,319
Australia
Message 1278727 - Posted: 1 Sep 2012, 8:13:08 UTC - in response to Message 1278722.  


No longer getting Scheduler errors & timeouts. For the last 40 min anyway.
Grant
Darwin NT
ID: 1278727
Lee Gresham
Joined: 12 Aug 03
Posts: 159
Credit: 130,116,228
RAC: 0
United States
Message 1279407 - Posted: 2 Sep 2012, 22:36:40 UTC - in response to Message 1278595.  

I set NNT yesterday morning when I noticed a truckload of pending downloads. This morning I had 36 tasks to download. Even with button pushing I still have 13 to download. I have Einstein running now. Being a holiday weekend, I don't expect the lab to get things up until after the Tuesday outage.



Work download problems gone on all 4 computers, however, I just caught the GPU on 1 PC running VLARs again. Haven't checked the other PCs yet.
Delta-V
ID: 1279407
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 39,902,384
RAC: 27,960
United Kingdom
Message 1279473 - Posted: 3 Sep 2012, 2:13:51 UTC - in response to Message 1279407.  

Work download problems gone on all 4 computers, however, I just caught the GPU on 1 PC running VLARs again. Haven't checked the other PCs yet.

VLARs!!
On a GPU??
Pray tell, how did you manage that?
I have not seen a VLAR in weeks.
I did not know that they still made them.
I WANT VLARs.
ID: 1279473
bill

Joined: 16 Jun 99
Posts: 861
Credit: 28,721,888
RAC: 37
United States
Message 1279486 - Posted: 3 Sep 2012, 2:46:37 UTC - in response to Message 1279473.  

Work download problems gone on all 4 computers, however, I just caught the GPU on 1 PC running VLARs again. Haven't checked the other PCs yet.

VLARs!!
On a GPU??
Pray tell, how did you manage that?
I have not seen a VLAR in weeks.
I did not know that they still made them.
I WANT VLARs.


Impossible! The servers have been fixed so that can never occur.
ID: 1279486
Claggy
Project Donor
Volunteer tester

Joined: 5 Jul 99
Posts: 4623
Credit: 46,352,825
RAC: 2,947
United Kingdom
Message 1279517 - Posted: 3 Sep 2012, 4:36:37 UTC - in response to Message 1279486.  

Work download problems gone on all 4 computers, however, I just caught the GPU on 1 PC running VLARs again. Haven't checked the other PCs yet.

VLARs!!
On a GPU??
Pray tell, how did you manage that?
I have not seen a VLAR in weeks.
I did not know that they still made them.
I WANT VLARs.


Impossible! The servers have been fixed so that can never occur.

He received those tasks 3 weeks ago, and has only just got round to starting them.

Claggy
ID: 1279517
bill

Joined: 16 Jun 99
Posts: 861
Credit: 28,721,888
RAC: 37
United States
Message 1279554 - Posted: 3 Sep 2012, 7:12:02 UTC - in response to Message 1279517.  


He received those tasks 3 weeks ago, and has only just got round to starting them.

Claggy


Better late than never.
ID: 1279554
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 39,902,384
RAC: 27,960
United Kingdom
Message 1279609 - Posted: 3 Sep 2012, 11:23:40 UTC - in response to Message 1279486.  

Work download problems gone on all 4 computers, however, I just caught the GPU on 1 PC running VLARs again. Haven't checked the other PCs yet.

VLARs!!
On a GPU??
Pray tell, how did you manage that?
I have not seen a VLAR in weeks.
I did not know that they still made them.
I WANT VLARs.


Impossible! The servers have been fixed so that can never occur.

I see it as the servers have been fubared so I/we don't get them :(
ID: 1279609
arkayn
Project Donor
Volunteer tester
Joined: 14 May 99
Posts: 4098
Credit: 51,576,341
RAC: 968
United States
Message 1280009 - Posted: 4 Sep 2012, 20:12:24 UTC

And we are back from another weekly backup.

ID: 1280009
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 7490
Credit: 91,162,254
RAC: 46,319
Australia
Message 1280352 - Posted: 5 Sep 2012, 18:11:21 UTC - in response to Message 1280009.  


Getting lots of Scheduler timeouts again.
Grant
Darwin NT
ID: 1280352
Gatekeeper
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
RAC: 0
United States
Message 1280381 - Posted: 5 Sep 2012, 19:28:20 UTC - in response to Message 1280352.  


Getting lots of Scheduler timeouts again.


Ditto. Of course, since APs are being split, that explains it.
ID: 1280381
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 39,902,384
RAC: 27,960
United Kingdom
Message 1280537 - Posted: 6 Sep 2012, 10:50:14 UTC

The timeouts are so bad I am out of work again.
Just can't get through the AP crush.
ID: 1280537
Slavac
Volunteer tester
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1280565 - Posted: 6 Sep 2012, 12:18:02 UTC - in response to Message 1280537.  

Anyone know what happened here?

http://setiathome.berkeley.edu/workunit.php?wuid=1060552647

Basically I had some tasks go into 'Timeout no response' within 2 hours of being sent to me. That's a bit odd.


Executive Director GPU Users Group Inc. -
brad@gpuug.org
ID: 1280565
Terror Australis
Volunteer tester

Joined: 14 Feb 04
Posts: 1790
Credit: 225,332,746
RAC: 10,480
Australia
Message 1280572 - Posted: 6 Sep 2012, 12:45:09 UTC - in response to Message 1280565.  

Anyone know what happened here?

http://setiathome.berkeley.edu/workunit.php?wuid=1060552647

Basically I had some tasks go into 'Timeout no response' within 2 hours of being sent to me. That's a bit odd.

Did your computer actually download these units or did the tasks get lost in Limbo due to the server load ?

50% of the units I'm getting atm are resends.

Someone mentioned in an earlier thread that reports with the client set to NNT get through. It's the reports with a request for new tasks that hang. This indicates an overload of the download servers.

T.A.
ID: 1280572
Slavac
Volunteer tester
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1280577 - Posted: 6 Sep 2012, 13:11:58 UTC - in response to Message 1280572.  

Anyone know what happened here?

http://setiathome.berkeley.edu/workunit.php?wuid=1060552647

Basically I had some tasks go into 'Timeout no response' within 2 hours of being sent to me. That's a bit odd.

Did your computer actually download these units or did the tasks get lost in Limbo due to the server load ?

50% of the units I'm getting atm are resends.

Someone mentioned in an earlier thread that reports with the client set to NNT get through. It's the reports with a request for new tasks that hang. This indicates an overload of the download servers.

T.A.


No idea, I was asleep. I didn't abort them or anything, just checked my tasks this morning and saw a bunch of fresh errors.


Executive Director GPU Users Group Inc. -
brad@gpuug.org
ID: 1280577
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1280587 - Posted: 6 Sep 2012, 14:02:46 UTC - in response to Message 1280565.  

Anyone know what happened here?

http://setiathome.berkeley.edu/workunit.php?wuid=1060552647

Basically I had some tasks go into 'Timeout no response' within 2 hours of being sent to me. That's a bit odd.

The only known cause for early expiry of tasks is when the "Resend lost work" feature finds that for some reason it cannot assign some lost work to the host. In this case, your host is only doing MB on CUDA and AP on CPU. I think that at 6 Sep 2012 | 11:45:23 UTC it requested CPU work only, so 20 lost MB tasks which it had originally assigned to CUDA were expired because they couldn't be resent to CPU.
Joe
ID: 1280587
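
Joe's explanation above can be pictured with a small sketch. This is not the BOINC scheduler's actual code; the `LostTask` class, the `resend_lost_work` function and the resource names are all hypothetical, and the only rule taken from the post is that a lost task is expired early when the current request does not ask for the resource it was originally assigned to.

```python
from dataclasses import dataclass


@dataclass
class LostTask:
    """Hypothetical record of a task the server believes the host has lost."""
    name: str
    resource: str          # resource it was originally assigned to, e.g. "cuda" or "cpu"
    expired: bool = False  # set when the task is expired early


def resend_lost_work(lost_tasks, requested_resources):
    """Split lost tasks into those that can be resent and those expired early.

    Assumed rule (from the post, not from BOINC source): a lost task is only
    resent if the host's request includes the task's original resource.
    """
    resent, expired = [], []
    for task in lost_tasks:
        if task.resource in requested_resources:
            resent.append(task)
        else:
            task.expired = True
            expired.append(task)
    return resent, expired


# The case described in the post: 20 lost MB tasks assigned to CUDA,
# and a scheduler request that asks for CPU work only.
lost = [LostTask(f"mb_{i}", "cuda") for i in range(20)]
resent, expired = resend_lost_work(lost, requested_resources={"cpu"})
print(len(resent), len(expired))
```

Under this assumed rule, a request that also asked for CUDA work would have let the same 20 tasks be resent instead of expired.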
Slavac
Volunteer tester
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1280588 - Posted: 6 Sep 2012, 14:06:56 UTC - in response to Message 1280587.  

Anyone know what happened here?

http://setiathome.berkeley.edu/workunit.php?wuid=1060552647

Basically I had some tasks go into 'Timeout no response' within 2 hours of being sent to me. That's a bit odd.

The only known cause for early expiry of tasks is when the "Resend lost work" feature finds that for some reason it cannot assign some lost work to the host. In this case, your host is only doing MB on CUDA and AP on CPU. I think that at 6 Sep 2012 | 11:45:23 UTC it requested CPU work only, so 20 lost MB tasks which it had originally assigned to CUDA were expired because they couldn't be resent to CPU.
Joe


Hrm it shouldn't have done that but thanks for the explanation. Was just trying to make sure I didn't do anything wrong.

Thanks Josef and T.A. :)


Executive Director GPU Users Group Inc. -
brad@gpuug.org
ID: 1280588
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 39,902,384
RAC: 27,960
United Kingdom
Message 1280748 - Posted: 6 Sep 2012, 21:07:53 UTC
Last modified: 6 Sep 2012, 21:11:09 UTC

Try this one for size :¬)
Task Self Destruct
sent - 6 Sep 2012 | 10:12:04 UTC
dead - 6 Sep 2012 | 10:20:01 UTC Timed out - no response
Just another impossible deadline.
I think there was a thread about it not long ago.
ID: 1280748
Keith Myers
Project Donor
Volunteer tester
Joined: 29 Apr 01
Posts: 775
Credit: 107,937,293
RAC: 120,693
United States
Message 1280823 - Posted: 7 Sep 2012, 1:27:56 UTC

I've recently gotten a slew of "Timed out - no response" errors too. Some in as little as 6 minutes. I think those were because they were VLARs getting assigned to the NVIDIA GPUs.

http://setiathome.berkeley.edu/workunit.php?wuid=1059826158

Keith
ID: 1280823
Gatekeeper
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
RAC: 0
United States
Message 1280840 - Posted: 7 Sep 2012, 2:41:25 UTC - in response to Message 1278504.  

Two of my three boxes are OK, normal D/L and U/L, a bit of a lag reporting. The third is a complete mess: over 2000 waiting D/Ls, and reporting is almost impossible. I've changed MTR from 256 to 100, and now 50, plus set NNT until I can get about 700 WUs reported.

EDIT: With 50 MTR and NNT, I got all 700 WUs reported. As soon as I removed NNT, BOINC tried to report two additional WUs and request new work... and timed out.

So it looks like any scheduler request with a work fetch is timing out.


No download issues ATM, but, same scheduler problem with reporting. The minute I set BOINC to NNT, the scheduler request goes through, almost immediately. Remove NNT and every request times out.
ID: 1280840
Rolf

Joined: 16 Jun 09
Posts: 114
Credit: 7,817,146
RAC: 0
Switzerland
Message 1280916 - Posted: 7 Sep 2012, 7:50:46 UTC - in response to Message 1280840.  

No download issues ATM, but, same scheduler problem with reporting. The minute I set BOINC to NNT, the scheduler request goes through, almost immediately. Remove NNT and every request times out.


Exactly the same here!
ID: 1280916


 