Panic Mode On (70) Server problems?

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1203311 - Posted: 7 Mar 2012, 11:21:49 UTC Last modified: 7 Mar 2012, 11:25:35 UTC
07/03/2012 15:20:18 SETI@home Reporting 259 completed tasks, requesting new tasks for CPU and GPU
07/03/2012 15:20:40 Project communication failed: attempting access to reference site
07/03/2012 15:20:40 SETI@home Scheduler request failed: Couldn't connect to server
07/03/2012 15:24:46 SETI@home Reporting 261 completed tasks, requesting new tasks for CPU and GPU
07/03/2012 15:25:08 Project communication failed: attempting access to reference site
07/03/2012 15:25:08 SETI@home Scheduler request failed: Couldn't connect to server
07/03/2012 15:25:10 Internet access OK - project servers may be temporarily down.
ID: 1203311 ·

LadyL Volunteer tester Send message Joined: 14 Sep 11 Posts: 1679 Credit: 5,230,097 RAC: 0	Message 1203313 - Posted: 7 Mar 2012, 11:28:30 UTC yes, Synergy has trouble handling the connections. It was running smoothly at first, I wonder what changed. I'm not the Pope. I don't speak Ex Cathedra! ID: 1203313 ·

Claggy Volunteer tester Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1203317 - Posted: 7 Mar 2012, 11:56:03 UTC - in response to Message 1203313. yes, Synergy has trouble handling the connections. It was running smoothly at first, I wonder what changed. Probably too many hosts reporting at once, some of them would have been backed off earlier as the project was down for maintenance. Claggy ID: 1203317 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1203323 - Posted: 7 Mar 2012, 12:43:55 UTC I would vote for synergy just being overloaded. Look at the server status page. Half of the list is on synergy. Granted, some of them are disabled, but it's still a lot of resource-intensive processes running simultaneously. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1203323 ·

HAL9000 Volunteer tester Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1203338 - Posted: 7 Mar 2012, 13:56:59 UTC Looking back in my logs I see the expected "Scheduler request failed: HTTP gateway timeout" messages after the maintenance completed. Since then, most of my requests are met with "Project has no tasks available" or "This computer has reached a limit on tasks in progress". It is a little odd seeing several "no tasks" messages and then the limit message. It seems to me that the logic to check the limit would go before checking for available tasks. Through the logs I see the response for limit reached occurring on average much faster than the response for no tasks. It doesn't seem to be a great difference: in the 0-3 second range for limit and 5-30 seconds for no tasks. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ ID: 1203338 ·

cliff Send message Joined: 16 Dec 07 Posts: 625 Credit: 3,590,440 RAC: 0	Message 1203360 - Posted: 7 Mar 2012, 14:59:52 UTC
And now at 14:58hrs GMT
07/03/2012 14:54:04 | SETI@home | Sending scheduler request: To fetch work.
07/03/2012 14:54:04 | SETI@home | Requesting new tasks for NVIDIA GPU
07/03/2012 14:55:27 | SETI@home | Scheduler request failed: HTTP internal server error
07/03/2012 14:57:29 | SETI@home | Sending scheduler request: To fetch work.
07/03/2012 14:57:29 | SETI@home | Reporting 1 completed tasks, requesting new tasks for NVIDIA GPU
07/03/2012 14:57:51 | SETI@home | Scheduler request failed: Couldn't connect to server
07/03/2012 14:57:54 | | Project communication failed: attempting access to reference site
07/03/2012 14:57:56 | | Internet access OK - project servers may be temporarily down.
Are we back to square one? Cliff, Been there, Done that, Still no damm T shirt! ID: 1203360 ·

cliff Send message Joined: 16 Dec 07 Posts: 625 Credit: 3,590,440 RAC: 0	Message 1203372 - Posted: 7 Mar 2012, 16:06:36 UTC - in response to Message 1203370. Last modified: 7 Mar 2012, 16:09:01 UTC Just got 4 x GPU tasks.. smidgin after 08:00 PT:-) Someone [?Matt?] must have gotten in early and given a server or two a boot in the OS's:-) [edit] And 7 mins later boinc asks for more crunchies and gets told to go play with itself.. There aint no more available. Regards, Cliff, Been there, Done that, Still no damm T shirt! ID: 1203372 ·

LadyL Volunteer tester Send message Joined: 14 Sep 11 Posts: 1679 Credit: 5,230,097 RAC: 0	Message 1203394 - Posted: 7 Mar 2012, 16:57:31 UTC The more whining I hear about the limits, the more I am tempted to say next time a problem crops up 'sod it' and just leave you big guys to your own devices. I'm not the Pope. I don't speak Ex Cathedra! ID: 1203394 ·

red-ray Send message Joined: 24 Jun 99 Posts: 308 Credit: 9,029,848 RAC: 0	Message 1203405 - Posted: 7 Mar 2012, 17:27:25 UTC - in response to Message 1203379. Last modified: 7 Mar 2012, 17:30:07 UTC There are always 200,000 to 300,000 WUs to send, but I never get anything. One part of the problem is the tiny-minuscule-microscopic-subatomic-nothingness server sub-cache, which contains nothing: 2 people out of 500-3000 queries/second get something and all the others get nothing-niet-nada. And you cannot ask again for 5 more minutes :( and if you don't get an answer, you cannot ask again for another 5 minutes. That server cache NEEDS to be more than doubled; it needs to be 10X bigger. If it's 100 tasks cached per second, it needs to be 1000; if it's 1000 tasks cached per minute, it needs to be 10,000! Another part of the problem is the small limit on tasks a PC gets; we aren't in the 1990s anymore. PC crunchers have 20X-200X-2000X the processing power we had last century. The maximum number of tasks should be based on the RAC the PC has and not on a fixed limit regardless of the power you have. With a RAC of 19,743.64 I can't see why you have a big problem with the current limits! ID: 1203405 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304	Message 1203435 - Posted: 7 Mar 2012, 18:38:02 UTC - in response to Message 1203405. Network traffic is still very ragged, and my log is full of "Scheduler request failed: Couldn't connect to server" & "Scheduler request failed: HTTP internal server" errors. Grant Darwin NT ID: 1203435 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304	Message 1203436 - Posted: 7 Mar 2012, 18:39:31 UTC - in response to Message 1203394. The more whining I hear about the limits, the more I am tempted to say next time a problem crops up 'sod it' and just leave you big guys to your own devices. It would be nice if they could sort out the DCF problem. It's been a while now. Grant Darwin NT ID: 1203436 ·

Bernie Vine Volunteer moderator Volunteer tester Send message Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328	Message 1203438 - Posted: 7 Mar 2012, 18:43:21 UTC No errors on my 5 machines just 1513 SETI@home 07/03/2012 18:31:02 Project has no tasks available ID: 1203438 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304	Message 1203439 - Posted: 7 Mar 2012, 18:46:22 UTC - in response to Message 1203438. No errors on my 5 machines just 1513 SETI@home 07/03/2012 18:31:02 Project has no tasks available I've got a few of those, along with the odd request that does result in work. But most of the requests result in an error message. Grant Darwin NT ID: 1203439 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1203443 - Posted: 7 Mar 2012, 19:06:54 UTC - in response to Message 1203442. Last modified: 7 Mar 2012, 19:07:15 UTC With a RAC of 19,743.64 I can't see why you have a big problem with the current limits! if it s me : i have 27k RAC (24k seti) and should be : 35k seti - 0 everything else He was referring to your fastest computer, not your total. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1203443 ·

red-ray Send message Joined: 24 Jun 99 Posts: 308 Credit: 9,029,848 RAC: 0	Message 1203488 - Posted: 7 Mar 2012, 21:15:31 UTC - in response to Message 1203442. With a RAC of 19,743.64 I can't see why you have a big problem with the current limits! if it s me : i have 27k RAC (24k seti) and should be : 35k seti - 0 everything else It's the per computer RAC that matters which are 19,818.22 and 7,865.20. No sensible regime could be based on the overall RAC. ID: 1203488 ·

rob smith Volunteer moderator Volunteer tester Send message Joined: 7 Mar 03 Posts: 22221 Credit: 416,307,556 RAC: 380	Message 1203489 - Posted: 7 Mar 2012, 21:16:29 UTC Last modified: 7 Mar 2012, 21:21:56 UTC The number of tasks available for distribution sits at around 200k. There are upper and lower limits in place and the pool cycles between the two limits. They are distributed in lots of 100. When a lot is assigned to a cruncher another lot of 100 is requested. If you make your request for new work when there are some available to be assigned you will get some, otherwise you will get the message about no work available. Some crunchers are far better at hitting the short window of work being available than others - it's a fact of life... (One of my crunchers gets work about two out of three attempts, the other about one in five - there are no prizes for guessing which one wants the most work....) RAC (Recent Average Credit) is a sort of rolling average. It is supposed to smooth out the lumps and bumps, but it is much more sensitive to a period of low credit, such as happens during an outage, than to a sudden burst of high credit, such as might happen shortly after an extended outage like the one we've just been through. If you are really that anxious about your rate of credit accrual, it would be far better to generate your own figures from the raw data; this will show trends more clearly. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? ID: 1203489 ·
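
For anyone who wants to generate their own figures as suggested above, a rolling average of this general kind can be sketched as an exponentially decayed average. This is only an illustration: the 7-day half-life, the one-sample-per-day input, and the names `update_average` and `simulate` are assumptions made for the sketch, not something stated in this thread or taken from the project's code.

```python
import math

# Assumed half-life for illustration only.
HALF_LIFE_DAYS = 7.0


def update_average(avg, credit, days_elapsed, half_life=HALF_LIFE_DAYS):
    """Blend the previous average with the credit rate of the latest interval.

    avg          -- previous average, in credit per day
    credit       -- credit granted during the interval
    days_elapsed -- length of the interval in days (must be > 0)
    """
    # Weight given to the old average; it halves every `half_life` days.
    weight = math.exp(-days_elapsed * math.log(2) / half_life)
    rate = credit / days_elapsed
    return avg * weight + (1.0 - weight) * rate


def simulate(daily_credits, start_avg=0.0):
    """Feed one credit total per day and return the average after each day."""
    avg = start_avg
    history = []
    for credit in daily_credits:
        avg = update_average(avg, credit, 1.0)
        history.append(avg)
    return history


if __name__ == "__main__":
    # A week-long outage followed by a week of catch-up credit.
    for day, value in enumerate(simulate([0.0] * 7 + [1500.0] * 7, 1000.0), 1):
        print(f"day {day:2d}: {value:8.1f}")
```

Under these assumptions, a host that was steady at 1000 per day drops to half that after seven days with no credit, which makes it easy to compare your own raw daily totals against the smoothed figure.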

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874	Message 1203492 - Posted: 7 Mar 2012, 21:28:21 UTC - in response to Message 1203489. The number of tasks available for distribution sits at around 200k. There are upper and lower limits in place and the pool cycles between the two limits. They are distributed in lots of 100. When a lot is assigned to a cruncher another lot of 100 is requested. If you make your request for new work when there are some in the available to be assigned you will get some, otherwise you will get the message about no work available. Some crunchers are far better at hitting the short window of work being available than others - its a fact of life... (One of my crunchers gets work about two out of three attempts, the other about one in five - there are no prizes for guessing which one wants the most work....) That seems a common observation. I do wonder if what happens when a large request from a fast computer comes in might be: Quick check to see if there are any in the 100 feeder-lot. OK, there are, we can continue. Long time spent reconciling the work in progress with work allocated, seeing if any need to be resent. Long time spent housekeeping on the work being reported, and acknowledging it. Er, what was the question again? Ooops, they've all gone - those slippery little 1-WU requests have slipped in and out again, emptying the pot. LOL. ID: 1203492 ·
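
The sequence guessed at above can be played out in a toy model. It is only a sketch of that speculation, not of the real scheduler: the function name `simulate_feeder`, the tick-based timing, the absence of any refill during the window, and the example numbers are all assumptions made for illustration. Only the lot size of 100 comes from the thread.

```python
FEEDER_SLOTS = 100  # lot size mentioned in the thread


def simulate_feeder(requests, slots=FEEDER_SLOTS):
    """Play out requests against a single feeder lot with no refill.

    requests -- list of (name, arrival_tick, overhead_ticks, wanted)
    Each request does a quick availability check on arrival, then spends
    overhead_ticks on bookkeeping before it actually takes any tasks.
    Returns a dict mapping name -> number of tasks granted.
    """
    events = []
    for seq, (name, arrival, overhead, wanted) in enumerate(requests):
        # Phase 0 (check) sorts before phase 1 (grab) at the same tick.
        events.append((arrival, 0, seq, "check", name, wanted))
        events.append((arrival + overhead, 1, seq, "grab", name, wanted))
    events.sort()

    available = slots
    passed_check = set()
    granted = {}
    for _tick, _phase, _seq, kind, name, wanted in events:
        if kind == "check":
            if available > 0:
                passed_check.add(name)
            else:
                granted[name] = 0  # "Project has no tasks available"
        elif name in passed_check:
            take = min(wanted, available)
            available -= take
            granted[name] = take
    return granted


if __name__ == "__main__":
    # One big host with 30 ticks of bookkeeping, 25 small hosts with none.
    reqs = [("big", 0, 30, 50)]
    reqs += [(f"small{i}", i, 0, 4) for i in range(1, 26)]
    result = simulate_feeder(reqs)
    print("big host got:", result["big"])
    print("small hosts got:", sum(v for k, v in result.items() if k != "big"))
```

In this toy run the big host passes the initial check but finds the lot empty by the time its bookkeeping finishes, which is the outcome the post describes. Whether the real server behaves this way is not established anywhere in the thread.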

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1203529 - Posted: 8 Mar 2012, 0:32:52 UTC My AP-only cache is nearing empty. I've got a little more than 1 full day left. It was full at 10 days before the problems arose. I'll be idle waiting for the v6 theory to be implemented. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1203529 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304	Message 1203593 - Posted: 8 Mar 2012, 5:06:48 UTC - in response to Message 1202414. Last modified: 8 Mar 2012, 5:10:25 UTC Still getting "Scheduler request failed: Couldn't connect to server" messages, but not as many as i had been. Now it's mostly "Project has no tasks available" messages. Every now & then i get a Wu or 2. Network traffic is still looking very ragged. Grant Darwin NT ID: 1203593 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1203614 - Posted: 8 Mar 2012, 7:28:19 UTC My single core machine is keeping full. Every time it asks for work, it gets 1 MB. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1203614 ·
