Panic Mode On (57) Server problems?

Message boards : Number crunching : Panic Mode On (57) Server problems?
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1158065 - Posted: 2 Oct 2011, 0:25:49 UTC

Server status.....


ID: 1158065
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13732
Credit: 208,696,464
RAC: 304
Australia
Message 1158090 - Posted: 2 Oct 2011, 1:34:15 UTC - in response to Message 1158065.  


Still no response from the Scheduler.
Grant
Darwin NT
ID: 1158090
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1158095 - Posted: 2 Oct 2011, 1:43:20 UTC

The Scheduler responds for me, but I'm getting nothing but "no tasks available." What I find sketchy is that the ready-to-send queue is building and the splitters are steaming right along, yet Cricket is pretty calm and there are lots of "no tasks available" replies. Did they drop the feeder cache to something like 10 every 2 seconds?
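That guessed-at feeder rate is easy to put in numbers. A quick sketch (note that "10 every 2 seconds" is the poster's speculation, not a confirmed server setting):

```python
# Back-of-envelope for the speculated feeder rate above.
# Both figures are the poster's guess, not confirmed server settings.

slots_per_refill = 10      # hypothetical feeder slots refilled per cycle
refill_interval_s = 2      # hypothetical refill interval in seconds

max_tasks_per_hour = slots_per_refill / refill_interval_s * 3600
print(max_tasks_per_hour)  # 18000.0 - an upper bound on hourly task issue
```

If those numbers were right, at most 18,000 tasks could go out per hour no matter how full the ready-to-send queue is, which would match a calm Cricket graph alongside busy splitters.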
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving up)
ID: 1158095
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1158097 - Posted: 2 Oct 2011, 1:51:52 UTC - in response to Message 1158095.  

Well, I'm connecting and receiving work again, so I'm happy.

Cheers.
ID: 1158097
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13732
Credit: 208,696,464
RAC: 304
Australia
Message 1158100 - Posted: 2 Oct 2011, 1:55:42 UTC - in response to Message 1158097.  

> Well, I'm connecting and receiving work again, so I'm happy.
>
> Cheers.

Tried it again just then. This time it connected.
The project still has no tasks available, though.
Grant
Darwin NT
ID: 1158100
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13732
Credit: 208,696,464
RAC: 304
Australia
Message 1158101 - Posted: 2 Oct 2011, 1:57:43 UTC - in response to Message 1158100.  
Last modified: 2 Oct 2011, 2:04:00 UTC

> > Well, I'm connecting and receiving work again, so I'm happy.
> >
> > Cheers.
>
> Tried it again just then. This time it connected.
> The project still has no tasks available, though.

EDIT: just tried it on the other machine & no response from the Scheduler.
Grant
Darwin NT
ID: 1158101
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13732
Credit: 208,696,464
RAC: 304
Australia
Message 1158103 - Posted: 2 Oct 2011, 1:59:25 UTC - in response to Message 1158101.  
Last modified: 2 Oct 2011, 2:03:46 UTC

> > > Well, I'm connecting and receiving work again, so I'm happy.
> > >
> > > Cheers.
> >
> > Tried it again just then. This time it connected.
> > The project still has no tasks available, though.
>
> EDIT: just tried it on the other machine & no response from the Scheduler.

EDIT again: just tried it again & not only got a response but got 30 WUs.

Things really are seriously screwed. Not just the flaky router but the servers as well.

Yet another EDIT: the 1st machine got a "server returned nothing (no headers, no data)" response from the Scheduler.
The next time it tried, it got a couple of WUs.
Grant
Darwin NT
ID: 1158103
Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1158134 - Posted: 2 Oct 2011, 3:44:32 UTC

If I were the admin in this situation, I would be completely incoherent. These are the times that sysadmins live on Cheetos and Mountain Dew.

Or Kit Kats and Coca-Cola (sugar and caffeine, mmmmmmmm).
ID: 1158134
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34257
Credit: 79,922,639
RAC: 80
Germany
Message 1158162 - Posted: 2 Oct 2011, 7:43:18 UTC

> > Team,
> >
> > The following messages are typical today...
> >
> >     10/1/2011 7:40:19 PM | SETI@home | update requested by user
> >     10/1/2011 7:40:21 PM | SETI@home | Sending scheduler request: Requested by user.
> >     10/1/2011 7:40:21 PM | SETI@home | Requesting new tasks for NVIDIA GPU
> >     10/1/2011 7:40:25 PM | SETI@home | Scheduler request completed: got 0 new tasks
> >     10/1/2011 7:40:25 PM | SETI@home | No tasks sent
> >     10/1/2011 7:40:25 PM | SETI@home | No tasks are available for SETI@home Enhanced
> >     10/1/2011 7:40:25 PM | SETI@home | Tasks for CPU are available, but your preferences are set to not accept them
> >
> > Jeff
>
> I wonder if that means that there are a lot of VLARs around (VLARs aren't sent to Nvidia GPUs).

Of the 88 units I've downloaded overnight, there were only 6 VLARs.
Most were VHARs, plus a few mid-range units.

With each crime and every kindness we birth our future.
ID: 1158162
soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1158175 - Posted: 2 Oct 2011, 9:05:17 UTC

Scheduler attempts are flaky at best. Some go through, others hit various errors or fail to connect; of those that go through, some yield units and some don't. All in all, the cache is building slowly, and uploads and downloads are flying through for me.


Janice
ID: 1158175
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1158177 - Posted: 2 Oct 2011, 9:21:00 UTC - in response to Message 1158162.  

> > > Team,
> > >
> > > The following messages are typical today...
> > >
> > >     10/1/2011 7:40:19 PM | SETI@home | update requested by user
> > >     10/1/2011 7:40:21 PM | SETI@home | Sending scheduler request: Requested by user.
> > >     10/1/2011 7:40:21 PM | SETI@home | Requesting new tasks for NVIDIA GPU
> > >     10/1/2011 7:40:25 PM | SETI@home | Scheduler request completed: got 0 new tasks
> > >     10/1/2011 7:40:25 PM | SETI@home | No tasks sent
> > >     10/1/2011 7:40:25 PM | SETI@home | No tasks are available for SETI@home Enhanced
> > >     10/1/2011 7:40:25 PM | SETI@home | Tasks for CPU are available, but your preferences are set to not accept them
> > >
> > > Jeff
> >
> > I wonder if that means that there are a lot of VLARs around (VLARs aren't sent to Nvidia GPUs).
>
> Of the 88 units I've downloaded overnight, there were only 6 VLARs.
> Most were VHARs, plus a few mid-range units.

I think Jeff should check his project preferences, make sure that SETI@home Enhanced is still selected, and then report back.

Claggy
ID: 1158177
__W__
Joined: 28 Mar 09
Posts: 116
Credit: 5,943,642
RAC: 0
Germany
Message 1158201 - Posted: 2 Oct 2011, 11:13:18 UTC

Ooh, ohhh - I think there is trouble ahead. I have never seen these values at this level: Results waiting for db purging: 10,049,030 MB#, 113,856 AP#.
And db_purge.x86_64 on vader is at "Disabled".
How many WUs can this database handle?

__W__
_______________________________________________________________________________
ID: 1158201
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1158209 - Posted: 2 Oct 2011, 12:04:16 UTC - in response to Message 1158201.  

> Ooh, ohhh - I think there is trouble ahead. I have never seen these values at this level: Results waiting for db purging: 10,049,030 MB#, 113,856 AP#.
> And db_purge.x86_64 on vader is at "Disabled".
> How many WUs can this database handle?
>
> __W__

That's a good question. From what I know of databases in general, I doubt we'll hit any problems with the absolute numbers - databases are designed to grow until they fill all the available disk space.

But what happens is that they get slower and slower as they get bigger. In particular for SETI, Matt has mentioned in the past that when the database gets too big for key tables (and indexes) to be held in RAM, performance is really hammered by having to swap pages at disk I/O speed. I think I'm seeing that already whenever I look at the full task list for one of my hosts.
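The effect is easy to see in a back-of-envelope sketch. Every figure here (entry size, buffer size, I/O costs) is an illustrative assumption, not a measured SETI@home number:

```python
# Rough model: cost of index lookups once a B-tree no longer fits in RAM.
# All constants are illustrative assumptions, not real server figures.

ROWS = 10_000_000               # order of the "waiting for db purging" count
BYTES_PER_ENTRY = 64            # assumed key + row pointer + B-tree overhead
BUFFER_POOL = 512 * 1024 ** 2   # assume 512 MB of RAM cache for this index

index_bytes = ROWS * BYTES_PER_ENTRY
fits_in_ram = index_bytes <= BUFFER_POOL

# ~100 ns for a RAM-cached page vs ~10 ms for a random disk seek
lookup_s = 100e-9 if fits_in_ram else 10e-3
print(f"index: {index_bytes / 1024 ** 2:.0f} MB, fits in RAM: {fits_in_ram}")
print(f"1000 random lookups: ~{1000 * lookup_s:.1f} s")
```

With these assumed numbers the index is roughly 610 MB, spills out of the 512 MB cache, and a thousand random lookups jump from microseconds of RAM time to around 10 seconds of disk seeks - the "hammered at disk I/O speed" behaviour described above.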
ID: 1158209
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1158211 - Posted: 2 Oct 2011, 12:07:22 UTC - in response to Message 1158209.  

> > Ooh, ohhh - I think there is trouble ahead. I have never seen these values at this level: Results waiting for db purging: 10,049,030 MB#, 113,856 AP#.
> > And db_purge.x86_64 on vader is at "Disabled".
> > How many WUs can this database handle?
> >
> > __W__
>
> That's a good question. From what I know of databases in general, I doubt we'll hit any problems with the absolute numbers - databases are designed to grow until they fill all the available disk space.
>
> But what happens is that they get slower and slower as they get bigger. In particular for SETI, Matt has mentioned in the past that when the database gets too big for key tables (and indexes) to be held in RAM, performance is really hammered by having to swap pages at disk I/O speed. I think I'm seeing that already whenever I look at the full task list for one of my hosts.

I have noticed the slow response when viewing task lists as well.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1158211
__W__
Joined: 28 Mar 09
Posts: 116
Credit: 5,943,642
RAC: 0
Germany
Message 1158212 - Posted: 2 Oct 2011, 12:13:32 UTC - in response to Message 1158209.  

This may be the reason why some crunchers could not get WUs up to the limits, despite Cricket not being at a high level - at least the connection to Berkeley is now good and fast from my point of view.

__W__
_______________________________________________________________________________
ID: 1158212
JohnDK
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1158214 - Posted: 2 Oct 2011, 12:32:20 UTC

> So.. I keep hearing about this limit of 50/CPU... my single-core machine has 99 in progress and the messages tab says nothing about a limit. Hm. Now it's 100, still no message. Either it's a glitch, or it's because a 10-day cache on this machine is somewhere between 90-110 MBs on average, at least with the weird estimates and most of them being shorties.

I keep hitting the limit on 2 PCs. What I don't understand is why BOINC still keeps asking for CPU work; just now it asked for new CPU work 5 times in a row. Wouldn't it be better if it waited until a CPU task finished before asking for more?
ID: 1158214
__W__
Joined: 28 Mar 09
Posts: 116
Credit: 5,943,642
RAC: 0
Germany
Message 1158220 - Posted: 2 Oct 2011, 12:40:23 UTC - in response to Message 1158214.  

> What I don't understand is why BOINC still keeps asking for CPU work; just now it asked for new CPU work 5 times in a row. Wouldn't it be better if it waited until a CPU task finished before asking for more?

Your cache is set to a higher level, so BOINC has to ask for more WUs. The 50/400 WU limits are set server-side, so BOINC doesn't "know" about them and can't take them into account.
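The mismatch can be sketched as a toy model (hypothetical numbers, not real BOINC code): the client only compares its cache against its own target, while the limit lives entirely on the server:

```python
# Toy model of the client/server mismatch described above - not real BOINC code.

SERVER_LIMIT = 50  # hypothetical server-side per-CPU in-progress limit

def client_request(cache_target, in_progress):
    """The client asks for whatever would fill its cache; it cannot
    see SERVER_LIMIT, so it keeps asking even when already capped."""
    return max(0, cache_target - in_progress)

def server_reply(in_progress, requested):
    """The server grants work only while the host is under its limit."""
    return min(requested, max(0, SERVER_LIMIT - in_progress))

in_progress = 50                         # host already at the limit
want = client_request(100, in_progress)  # a big cache still wants 50 more
got = server_reply(in_progress, want)    # the server sends nothing
print(want, got)                         # 50 0
```

Because `client_request` never sees `SERVER_LIMIT`, the loop of "ask for 50, get 0" repeats until either the cache target shrinks or a finished task drops `in_progress` below the limit.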

__W__
_______________________________________________________________________________
ID: 1158220
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1158221 - Posted: 2 Oct 2011, 12:46:49 UTC - in response to Message 1158214.  

> > So.. I keep hearing about this limit of 50/CPU... my single-core machine has 99 in progress and the messages tab says nothing about a limit. Hm. Now it's 100, still no message. Either it's a glitch, or it's because a 10-day cache on this machine is somewhere between 90-110 MBs on average, at least with the weird estimates and most of them being shorties.
>
> I keep hitting the limit on 2 PCs. What I don't understand is why BOINC still keeps asking for CPU work; just now it asked for new CPU work 5 times in a row. Wouldn't it be better if it waited until a CPU task finished before asking for more?

BOINC hasn't been programmed to take any notice of the reason why no work is issued - even when the reason is stated (it isn't always).

The newer BOINC v6.12.xx versions are programmed to back off and ask less frequently when no work is forthcoming - but the backoff is reset to zero when a task completes, and BOINC allows time for the just-completed task to upload and be available for reporting before requesting new work. So its behaviour is quite close to what you're suggesting: if you have reached the quota limit, there's a fair likelihood that your next scheduler request will be 'report one, get one replacement'.

You, on the other hand, are running BOINC v6.10.58/60 - that version simply knows how much work you've said you'd like to have, and keeps asking 'more, more, more' even when the server is replying 'no, no, no'. If you are concerned about the strain your repeated fruitless requests place on the servers, you could consider upgrading to BOINC v6.12.34, or temporarily reduce your cache size to something that matches the current (temporary) quota limit more closely.
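The back-off-and-reset behaviour described for v6.12 can be sketched like this (a simplification; the bounds and plain doubling are assumptions, not BOINC's actual constants):

```python
# Sketch of the back-off-and-reset behaviour described above.
# MIN_S/MAX_S and plain doubling are assumptions, not BOINC's real constants.

class WorkFetchBackoff:
    MIN_S = 60           # assumed shortest wait after an empty reply
    MAX_S = 4 * 3600     # assumed cap on the wait

    def __init__(self):
        self.delay = 0   # seconds until the next work request

    def request_failed(self):
        """No work granted: double the wait, clamped to [MIN_S, MAX_S]."""
        self.delay = min(self.MAX_S, max(self.MIN_S, self.delay * 2))

    def task_completed(self):
        """A finished task frees quota, so the next request goes out at once."""
        self.delay = 0

b = WorkFetchBackoff()
for _ in range(3):
    b.request_failed()
print(b.delay)      # 240: three empty replies -> 60, 120, 240 seconds
b.task_completed()
print(b.delay)      # 0: backoff reset by the completed task
```

The reset on completion is what makes the effective behaviour "report one, get one replacement": while work is flowing, the backoff never grows, and it only stretches out during genuine droughts.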
ID: 1158221
JohnDK
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1158226 - Posted: 2 Oct 2011, 12:55:49 UTC

I don't like v6.12.* so I will reduce my cache setting for now :)
ID: 1158226
__W__
Joined: 28 Mar 09
Posts: 116
Credit: 5,943,642
RAC: 0
Germany
Message 1158227 - Posted: 2 Oct 2011, 13:03:08 UTC - in response to Message 1158221.  

> BOINC hasn't been programmed to take any notice of the reason why no work is issued - even when the reason is stated (it isn't always).
>
> The newer BOINC v6.12.xx versions are programmed to back off and ask less frequently when no work is forthcoming - but the backoff is reset to zero when a task completes, and BOINC allows time for the just-completed task to upload and be available for reporting before requesting new work. So its behaviour is quite close to what you're suggesting: if you have reached the quota limit, there's a fair likelihood that your next scheduler request will be 'report one, get one replacement'.
>
> You, on the other hand, are running BOINC v6.10.58/60 - that version simply knows how much work you've said you'd like to have, and keeps asking 'more, more, more' even when the server is replying 'no, no, no'. If you are concerned about the strain your repeated fruitless requests place on the servers, you could consider upgrading to BOINC v6.12.34, or temporarily reduce your cache size to something that matches the current (temporary) quota limit more closely.

That's the much more detailed explanation, thanks ;-)
__W__

_______________________________________________________________________________
ID: 1158227
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.