Panic Mode On (64) Server problems?



Message boards : Number crunching : Panic Mode On (64) Server problems?

Blake Bonkofsky
Volunteer tester
Joined: 29 Dec 99
Posts: 617
Credit: 46,332,781
RAC: 0
United States
Message 1181712 - Posted: 30 Dec 2011, 23:45:10 UTC - in response to Message 1181690.
Last modified: 30 Dec 2011, 23:46:01 UTC

Limits are still 50 per logical CPU Core and 400 per GPU as far as I know. Your limit should be 1000 overall. If it's only asking for GPU work, and you are already at 800 for them, it will give you that message even if the CPU is under the limit.
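For anyone who wants to check the arithmetic, here is a minimal sketch of how those limits add up. The 50 and 400 figures are the ones quoted above; the host with 4 logical cores and 2 GPUs is an assumption chosen because it matches the 800 and 1000 numbers, and the function names are made up for illustration (this is not the actual scheduler code).

```python
# Per-resource limits as stated in the post above; treat them as that
# poster's recollection, not as confirmed server configuration.
CPU_TASKS_PER_CORE = 50
GPU_TASKS_PER_GPU = 400


def task_limits(logical_cores, gpus):
    """Return (cpu_limit, gpu_limit) for a host."""
    return CPU_TASKS_PER_CORE * logical_cores, GPU_TASKS_PER_GPU * gpus


def gpu_request_blocked(gpu_tasks_in_progress, gpus):
    """True if a GPU-only work request would hit the GPU limit."""
    return gpu_tasks_in_progress >= GPU_TASKS_PER_GPU * gpus


# Hypothetical host: 4 logical cores and 2 GPUs.
cpu_limit, gpu_limit = task_limits(logical_cores=4, gpus=2)
print(cpu_limit, gpu_limit, cpu_limit + gpu_limit)  # 200 800 1000
print(gpu_request_blocked(gpu_tasks_in_progress=800, gpus=2))  # True
```

The point of keeping the two limits separate is that a GPU-only request is checked against the GPU limit alone, so spare room on the CPU side does not help.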
____________

Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 476
Credit: 41,308,954
RAC: 6,471
United States
Message 1181854 - Posted: 31 Dec 2011, 9:00:22 UTC

Looks like things are getting a little dicey.

Getting lots of "servers may be down" messages, etc.

Graphs are taking dips.


____________
Dave

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5917
Credit: 61,705,858
RAC: 21,775
Australia
Message 1181867 - Posted: 31 Dec 2011, 10:46:53 UTC - in response to Message 1181854.


Yeah, not a good way to start the new year.
Traffic has dropped off. All recent requests for work have resulted in "Project has no tasks available" messages.
____________
Grant
Darwin NT.

MikeN
Joined: 24 Jan 11
Posts: 302
Credit: 32,822,708
RAC: 5,535
United Kingdom
Message 1181879 - Posted: 31 Dec 2011, 11:45:45 UTC

Looks like the splitters have all run out of AP workunits. As a result, downloads are taking less bandwidth hence the dip in the cricket graph. Everything else is OK though: plenty of MB WUs are available, and uploads and reporting are still going through OK.
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5917
Credit: 61,705,858
RAC: 21,775
Australia
Message 1182010 - Posted: 31 Dec 2011, 23:00:53 UTC - in response to Message 1181879.

Looks like the splitters have all run out of AP workunits. As a result, downloads are taking less bandwidth hence the dip in the cricket graph.

Nope- there was a problem for a few hours there where work wasn't being allocated when requested, but it appears to have sorted itself out for now.

____________
Grant
Darwin NT.

Wiggo
Joined: 24 Jan 00
Posts: 7920
Credit: 98,327,381
RAC: 28,719
Australia
Message 1182024 - Posted: 31 Dec 2011, 23:49:22 UTC - in response to Message 1182010.

Looks like the splitters have all run out of AP workunits. As a result, downloads are taking less bandwidth hence the dip in the cricket graph.

Nope- there was a problem for a few hours there where work wasn't being allocated when requested, but it appears to have sorted itself out for now.

I think that the problem was more with part of the connection going dickie again than with the servers, because as soon as I selected another proxy things cleared up real quickly.

Cheers.
____________

Dave
Joined: 29 Mar 02
Posts: 774
Credit: 23,193,139
RAC: 0
United Kingdom
Message 1182027 - Posted: 1 Jan 2012, 0:18:52 UTC

Never mind SETI problems, a very happy new year to all the people here from the Greenwich Meridian :).

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5917
Credit: 61,705,858
RAC: 21,775
Australia
Message 1182039 - Posted: 1 Jan 2012, 1:15:17 UTC - in response to Message 1182024.

Looks like the splitters have all run out of AP workunits. As a result, downloads are taking less bandwidth hence the dip in the cricket graph.

Nope- there was a problem for a few hours there where work wasn't being allocated when requested, but it appears to have sorted itself out for now.

I think that the problem was more with part of the connection going dickie again than with the servers, because as soon as I selected another proxy things cleared up real quickly.

Co-incidence.
If you look at the graphs you can see that both uploads & downloads went goofy.
Also the connection you use has no effect on whether work is allocated or not. I had no problems with Scheduler requests or uploads, but I was unable to download anything because nothing was allocated when requested, just "Project has no tasks available" messages. Those are system-related messages, not network.
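A rough way to make that distinction when reading a client log is to sort messages by the layer they point to. This is only a sketch: the message fragments are copied from posts in this thread, but the grouping and the names `classify`, `SERVER_SIDE` and `TRANSPORT_SIDE` are assumptions for illustration, not how BOINC itself categorises anything. A failed exchange can of course still be caused by an overloaded server.

```python
# Message fragments quoted in this thread, grouped by the layer they
# point to. The grouping is an assumption, not BOINC's own logic.
SERVER_SIDE = (
    "Project has no tasks available",
    "No tasks sent",
    "No tasks are available",
)
TRANSPORT_SIDE = (
    "Scheduler request failed",
    "Project communication failed",
)


def classify(message):
    """Label a log message as 'server', 'transport' or 'other'."""
    # The scheduler answered, it just had nothing to hand out.
    if any(fragment in message for fragment in SERVER_SIDE):
        return "server"
    # The request/reply exchange itself did not complete.
    if any(fragment in message for fragment in TRANSPORT_SIDE):
        return "transport"
    return "other"


print(classify("Project has no tasks available"))  # server
print(classify("Scheduler request failed: Failure when receiving data from the peer"))  # transport
```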
____________
Grant
Darwin NT.

Wiggo
Joined: 24 Jan 00
Posts: 7920
Credit: 98,327,381
RAC: 28,719
Australia
Message 1182040 - Posted: 1 Jan 2012, 1:21:29 UTC - in response to Message 1182039.
Last modified: 1 Jan 2012, 1:22:44 UTC

Looks like the splitters have all run out of AP workunits. As a result, downloads are taking less bandwidth hence the dip in the cricket graph.

Nope- there was a problem for a few hours there where work wasn't being allocated when requested, but it appears to have sorted itself out for now.

I think that the problem was more with part of the connection going dickie again than with the servers, because as soon as I selected another proxy things cleared up real quickly.

Co-incidence.
If you look at the graphs you can see that both uploads & downloads went goofy.
Also the connection you use has no effect on whether work is allocated or not. I had no problems with Scheduler requests or uploads, but I was unable to download anything because nothing was allocated when requested, just "Project has no tasks available" messages. Those are system-related messages, not network.

I know that it went goofy, but as I said before, as soon as I changed back to using proxies the problems stopped for me, even though the graphs stayed goofy for a few hours after that. During that time I was uploading, reporting and downloading as usual using proxies (I also use a different proxy for each rig).

Cheers.
____________

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,902,797
RAC: 257
Netherlands
Message 1182055 - Posted: 1 Jan 2012, 2:41:42 UTC - in response to Message 1182040.

Looks like the splitters have all run out of AP workunits. As a result, downloads are taking less bandwidth hence the dip in the cricket graph.

Nope- there was a problem for a few hours there where work wasn't being allocated when requested, but it appears to have sorted itself out for now.

I think that the problem was more with part of the connection going dickie again than with the servers, because as soon as I selected another proxy things cleared up real quickly.

Co-incidence.
If you look at the graphs you can see that both uploads & downloads went goofy.
Also the connection you use has no effect on whether work is allocated or not. I had no problems with Scheduler requests or uploads, but I was unable to download anything because nothing was allocated when requested, just "Project has no tasks available" messages. Those are system-related messages, not network.

I know that it went goofy, but as I said before, as soon as I changed back to using proxies the problems stopped for me, even though the graphs stayed goofy for a few hours after that. During that time I was uploading, reporting and downloading as usual using proxies (I also use a different proxy for each rig).

Cheers.



Every morning around 8 or 9, I see 3 hosts, all with a large download queue and some 8 hours to go before a retry. But one push is almost always enough to download the queue, at 10-35 Kbit, plus some additional work.
This is almost standard procedure for my 3 hosts: 2 CUDA/OpenCL* (GTX 470* and 480) and 1 (CAL) ATI OpenCL with 2x 5870 GPUs; the CPU is an i7-2600 (stock; DDR3-1333).

Also, use a small cache. Why a big one? You'll have a hard time filling it, depending on which version of BOINC you run: 6.10.60 is more "consistent in getting work", and 6.12.34, if you let it run, does supply you with enough work, although no cache is built.
First In, First Out (FIFO).
Using a proxy shouldn't be necessary, but there can be a lot of reasons why you have to use one.
I still have a 'FIXED' IP address and a 'SmartModemRouter', which runs an embedded version of *NUX/NIX (as does my digital TV STB, which only uses different coding/decoding algorithms)!

Two problems: bandwidth and telescope time, both very expensive, so SETI@home "piggybacks" and so changes position.
Another thing: looking at the date of the work, Result 2239662802 is quite up to date.
(A 'crunching farm' near Arecibo, or a WLAN transmitter to the Berkeley lab!?)

Here, it's already 3 1/2 hours into 2012. Wishing you the best of health, a bright future, opportunities, luck...


____________

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3684
Credit: 21,180,938
RAC: 6,168
Sweden
Message 1182176 - Posted: 1 Jan 2012, 18:22:52 UTC

What's up with the Replica database? It's over 4 hours behind the master database now?

Are we in for a belly up session?
____________

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3684
Credit: 21,180,938
RAC: 6,168
Sweden
Message 1182340 - Posted: 2 Jan 2012, 4:31:12 UTC

Panic mode suspended. AP being split and assigned again.
____________

Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 476
Credit: 41,308,954
RAC: 6,471
United States
Message 1182348 - Posted: 2 Jan 2012, 5:18:18 UTC

APs may be being split, but I don't think we are going to get many.

SLWS010
1/1/2012 21:11:39 Requesting new tasks for CPU
1/1/2012 21:12:49 Scheduler request failed: Failure when receiving data from the peer
1/1/2012 21:12:50 Project communication failed: attempting access to reference site
1/1/2012 21:12:51 Internet access OK - project servers may be temporarily down.
1/1/2012 21:12:58 update requested by user
1/1/2012 21:12:59 Sending scheduler request: Requested by user.
1/1/2012 21:12:59 Requesting new tasks for CPU
1/1/2012 21:13:21 Scheduler request failed: Failure when receiving data from the peer
1/1/2012 21:13:22 Project communication failed: attempting access to reference site
1/1/2012 21:13:23 Internet access OK - project servers may be temporarily down.
1/1/2012 21:13:41 update requested by user
1/1/2012 21:13:46 Sending scheduler request: Requested by user.
1/1/2012 21:13:46 Requesting new tasks for CPU
1/1/2012 21:14:10 Project communication failed: attempting access to reference site
1/1/2012 21:14:10 Scheduler request failed: Failure when receiving data from the peer
1/1/2012 21:14:11 Internet access OK - project servers may be temporarily down.

____________
Dave

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5917
Credit: 61,705,858
RAC: 21,775
Australia
Message 1182351 - Posted: 2 Jan 2012, 5:33:49 UTC - in response to Message 1182348.


I've got no problems contacting the Scheduler, it just doesn't want to give me any work.
"Project has no tasks available" is the usual response.
____________
Grant
Darwin NT.

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3684
Credit: 21,180,938
RAC: 6,168
Sweden
Message 1182354 - Posted: 2 Jan 2012, 5:39:19 UTC - in response to Message 1182348.

APs may be being split, but I don't think we are going to get many.

SLWS010
1/1/2012 21:11:39 Requesting new tasks for CPU
1/1/2012 21:12:49 Scheduler request failed: Failure when receiving data from the peer
1/1/2012 21:12:50 Project communication failed: attempting access to reference site
1/1/2012 21:12:51 Internet access OK - project servers may be temporarily down.
1/1/2012 21:12:58 update requested by user
1/1/2012 21:12:59 Sending scheduler request: Requested by user.
1/1/2012 21:12:59 Requesting new tasks for CPU
1/1/2012 21:13:21 Scheduler request failed: Failure when receiving data from the peer
1/1/2012 21:13:22 Project communication failed: attempting access to reference site
1/1/2012 21:13:23 Internet access OK - project servers may be temporarily down.
1/1/2012 21:13:41 update requested by user
1/1/2012 21:13:46 Sending scheduler request: Requested by user.
1/1/2012 21:13:46 Requesting new tasks for CPU
1/1/2012 21:14:10 Project communication failed: attempting access to reference site
1/1/2012 21:14:10 Scheduler request failed: Failure when receiving data from the peer
1/1/2012 21:14:11 Internet access OK - project servers may be temporarily down.


Yeah, I get those failures too, but in between them I also get some APs.

____________

Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 476
Credit: 41,308,954
RAC: 6,471
United States
Message 1182355 - Posted: 2 Jan 2012, 5:41:00 UTC

I'm getting none available also.

I have gotten a couple of APs.


____________
Dave

john3760
Joined: 9 Feb 11
Posts: 334
Credit: 3,400,979
RAC: 0
United Kingdom
Message 1182359 - Posted: 2 Jan 2012, 6:17:16 UTC - in response to Message 1182355.

I just got this

02/01/2012 06:12:17 SETI@home Scheduler request completed: got 0 new tasks
02/01/2012 06:12:17 SETI@home Message from server: No tasks sent
02/01/2012 06:12:17 SETI@home Message from server: No tasks are available for Astropulse v505


john3760

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2326
Credit: 8,868,690
RAC: 808
United States
Message 1182361 - Posted: 2 Jan 2012, 6:34:58 UTC

I got this a few minutes ago:

2012-01-02 00:55:41|SETI@home|Sending scheduler request: To fetch work. Requesting 179988 seconds of work, reporting 0 completed tasks
2012-01-02 00:55:46|SETI@home|Scheduler request succeeded: got 2 new tasks
2012-01-02 00:55:49|SETI@home|Started download of ap_15oc11ac_B1_P1_00213_20120101_29547.wu
2012-01-02 00:55:49|SETI@home|Started download of ap_18oc11ag_B0_P0_00231_20120101_23563.wu


GMT-5.
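
Since the log times above are local (GMT-5) and the forum stamps are UTC, here is a small sketch for lining the two up. It assumes the pipe-separated `timestamp|project|message` layout shown in the lines above and a fixed -5 hour offset; `parse_line` and `new_task_count` are hypothetical helper names, not part of BOINC.

```python
import re
from datetime import datetime, timedelta, timezone

# Fixed offset taken from the "GMT-5" note above (assumed, no DST handling).
LOCAL_TZ = timezone(timedelta(hours=-5))
NEW_TASKS = re.compile(r"got (\d+) new tasks")


def parse_line(line):
    """Split 'timestamp|project|message' and convert the time to UTC."""
    stamp, project, message = line.split("|", 2)
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=LOCAL_TZ)
    return local.astimezone(timezone.utc), project, message


def new_task_count(message):
    """Return the number of new tasks in a scheduler reply, or None."""
    match = NEW_TASKS.search(message)
    return int(match.group(1)) if match else None


line = "2012-01-02 00:55:46|SETI@home|Scheduler request succeeded: got 2 new tasks"
when_utc, project, message = parse_line(line)
print(when_utc.isoformat(), project, new_task_count(message))
# 2012-01-02T05:55:46+00:00 SETI@home 2
```

So the two APs arrived at about 05:55 UTC, roughly 40 minutes before this post.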
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Kevin Olley
Joined: 3 Aug 99
Posts: 368
Credit: 35,328,637
RAC: 670
United Kingdom
Message 1182414 - Posted: 2 Jan 2012, 16:03:02 UTC

It's looking good, sitting on the GPU limits and not ONE shorty in sight.



____________
Kevin


Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3684
Credit: 21,180,938
RAC: 6,168
Sweden
Message 1182631 - Posted: 3 Jan 2012, 19:45:52 UTC
Last modified: 3 Jan 2012, 20:24:05 UTC

Now that was a fast and furious outage. :-)

Edit: Now all we need is for the scheduling server, and the scheduler process to come online.

Well, and more APs being split of course...

Edit 2, 40 minutes later: Scheduling server, and scheduler process, where are you? Hallo, hallooooooo.....
____________


Copyright © 2014 University of California