Panic Mode On (81) Server Problems?

Author	Message
Kathy Joined: 5 Jan 03 Posts: 338 Credit: 27,877,436 RAC: 0	Message 1335459 - Posted: 7 Feb 2013, 15:28:36 UTC Haven't been able to connect for 3 days now, and get the same messages every time:
2/7/2013 10:25:47 AM | SETI@home | Reporting 81 completed tasks, requesting new tasks for CPU and ATI
2/7/2013 10:26:11 AM | SETI@home | Scheduler request failed: Couldn't connect to server
2/7/2013 10:26:15 AM | | Project communication failed: attempting access to reference site
2/7/2013 10:26:16 AM | | Internet access OK - project servers may be temporarily down.

James Sotherden Joined: 16 May 99 Posts: 10436 Credit: 110,373,059 RAC: 54	Message 1335460 - Posted: 7 Feb 2013, 15:30:55 UTC Last modified: 7 Feb 2013, 15:32:00 UTC Yes, it seems lately we have had a lot of problems. No work, ghost work units, can't report, can't download, can't upload. Or download speeds so slow you could go to the lab and get them faster in person. I can assure you that no one likes it. Not even the lab crew. But when you are the Number 1 BOINC project in terms of active members and underfunded, it's understandable. I do other projects when I have no work. I'd rather not, but I do just to keep my computers busy. And they are worthy projects in their own right. Old James

Cliff Harding Volunteer tester Joined: 18 Aug 99 Posts: 1432 Credit: 110,967,840 RAC: 67	Message 1335483 - Posted: 7 Feb 2013, 16:34:53 UTC Don't know what happened, just woke up from dozing off for about an hour and noticed that the scheduler has dumped almost a full fuel load into my fastest rig, finishing as I opened my eyes. Now if it would only bless my other rig, both machines will quit bitc(&^$$^%%$inhg. I don't buy computers, I build them!!

Gone Joined: 31 May 99 Posts: 150 Credit: 125,779,206 RAC: 0	Message 1335485 - Posted: 7 Feb 2013, 16:43:45 UTC - in response to Message 1335483. Don't know what happened, just woke up from dozing off for about an hour and noticed that the scheduler has dumped almost a full fuel load into my fastest rig, finishing as I opened my eyes. Now if it would only bless my other rig, both machines will quit bitc(&^$$^%%$inhg. Similar thing just happened to me, except I was only dreaming. When I awoke all was still broken ... :)

Cliff Harding Volunteer tester Joined: 18 Aug 99 Posts: 1432 Credit: 110,967,840 RAC: 67	Message 1335490 - Posted: 7 Feb 2013, 16:55:38 UTC - in response to Message 1335485. Don't know what happened, just woke up from dozing off for about an hour and noticed that the scheduler has dumped almost a full fuel load into my fastest rig, finishing as I opened my eyes. Now if it would only bless my other rig, both machines will quit bitc(&^$$^%%$inhg. Similar thing just happened to me, except I was only dreaming. When I awoke all was still broken ... :) My luck is still holding, just got 2 more OpenCL units since my initial post. I don't buy computers, I build them!!

fscheel Joined: 13 Apr 12 Posts: 73 Credit: 11,135,641 RAC: 0	Message 1335495 - Posted: 7 Feb 2013, 17:09:51 UTC Sure would be nice to have a place to get some information as to what is being done to correct this issue. Or is there and I just don't know where to look?

KWSN Ekky Ekky Ekky Joined: 25 May 99 Posts: 944 Credit: 52,956,491 RAC: 67	Message 1335496 - Posted: 7 Feb 2013, 17:19:32 UTC - in response to Message 1335495. Sure would be nice to have a place to get some information as to what is being done to correct this issue. Or is there and I just don't know where to look? This is the place. Welcome to Room 101.

fscheel Joined: 13 Apr 12 Posts: 73 Credit: 11,135,641 RAC: 0	Message 1335500 - Posted: 7 Feb 2013, 17:27:22 UTC - in response to Message 1335496. Sure would be nice to have a place to get some information as to what is being done to correct this issue. Or is there and I just don't know where to look? This is the place. Welcome to Room 101. Thanks. Guess I need a crash course in reading comprehension. :)

BarryAZ Joined: 1 Apr 01 Posts: 2580 Credit: 16,982,517 RAC: 0	Message 1335501 - Posted: 7 Feb 2013, 17:28:16 UTC - in response to Message 1335495. Last modified: 7 Feb 2013, 17:47:59 UTC I had been running NNT to trim out the existing work units. I don't run GPU work units from SETI for a number of reasons so I suppose my frustration level is lower than many. These days, as the project repeatedly bounces its collective head against the AP mass traffic tie up - something which has happened many times over the past few months without a resolution -- when I encounter the 'Dead SETI' scenario, I simply suspend the project on the handful of workstations still running SETI and feed projects that don't suffer from this sort of persistent reliability problem. Perhaps at some point the folks back at the project will figure out that running the high traffic volume AP work units along with the worthless less-than-one-minute CPU work units only serves to reduce the project's effective work and thus might be best avoided. Until then, I figure to watch closely, run and report in remaining work during the increasingly rare full functionality periods and once that clears, simply watch to see if a project learning process is demonstrated. Alternatively, it might be part of a grand plan at the project to reduce problems by pushing away users until the traffic levels are low enough to be sustained.

Grant (SSSF) Volunteer tester Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1335513 - Posted: 7 Feb 2013, 17:59:36 UTC - in response to Message 1335501. Woke up this morning to find both systems rapidly running out of work. Large backoffs due to not being able to connect to the Scheduler when it tries to do so. Grant Darwin NT

KWSN Ekky Ekky Ekky Joined: 25 May 99 Posts: 944 Credit: 52,956,491 RAC: 67	Message 1335515 - Posted: 7 Feb 2013, 18:07:27 UTC - in response to Message 1335513. Woke up this morning to find both systems rapidly running out of work. Large backoffs due to not being able to connect to the Scheduler when it tries to do so. This is obviously the start of a 1920s Blues song. "Woke up this morning, Found I was nearly out of work. Woke up this morning, etc..."

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1335519 - Posted: 7 Feb 2013, 18:13:19 UTC Last modified: 7 Feb 2013, 18:13:40 UTC Can't report tasks, scheduler not responding. Anyone seeing the same now? SETI apps news We're not gonna fight them. We're gonna transcend them.

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1335523 - Posted: 7 Feb 2013, 18:18:48 UTC - in response to Message 1335519. Can't report tasks, scheduler not responding. Anyone seeing the same now? Been extremely hard to contact the scheduler for days now..... "Freedom is just Chaos, with better lighting." Alan Dean Foster

Grant (SSSF) Volunteer tester Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1335531 - Posted: 7 Feb 2013, 18:36:53 UTC - in response to Message 1335519. Last modified: 7 Feb 2013, 18:44:51 UTC Can't report tasks, scheduler not responding. Anyone seeing the same now? For about 3-4 days now. EDIT- and if you are able to contact the Scheduler & don't get a "Failure when receiving data from the peer" message, it takes 2-5 min to get a response. Grant Darwin NT

Keith White Joined: 29 May 99 Posts: 392 Credit: 13,035,233 RAC: 22	Message 1335544 - Posted: 7 Feb 2013, 19:05:48 UTC Last modified: 7 Feb 2013, 19:44:55 UTC Now here is a possible cause that super crunchers won't likely realize. Because the 100 unit max limit is less than what I would normally have based on queue size, the client is requesting more units every five minutes only to be (when I get through) rebuffed due to the 100 unit limit. Now super crunchers are likely to actually have completed units to report every five minutes but for me, maybe only 2-3 units per hour on average. So 9 or 10 more requests than are required are sent to the server each hour. Now multiply that by all the hosts that are crunching as "slow" or slower than me (daily RAC ranks my host in the top 4,500-5,000 range) and that adds up to a lot of meaningless requests to the server. Could that be causing a problem? "Life is just nature's way of keeping meat fresh." - The Doctor
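
The "9 or 10" figure in the post above can be checked with a little arithmetic. What follows is only a rough sketch using the numbers quoted in the post (a request every five minutes, 2-3 completed units per hour); the function name is made up for illustration and nothing here reflects how the BOINC client or the SETI@home scheduler actually counts requests.

```python
# Rough check of the "pointless request" estimate from the post above.
# Assumed figures (quoted in the post, not measured): one scheduler
# request every 5 minutes, and 2-3 completed units to report per hour.

REQUEST_INTERVAL_MIN = 5


def pointless_requests_per_hour(completed_per_hour: int) -> int:
    """Requests per hour beyond those needed to report finished work."""
    requests_per_hour = 60 // REQUEST_INTERVAL_MIN  # 12 at a 5 minute interval
    # At most one request per completed unit does anything useful.
    useful = min(completed_per_hour, requests_per_hour)
    return requests_per_hour - useful


for done in (2, 3):
    print(done, "units/hour ->", pointless_requests_per_hour(done), "extra requests")
```

Under those assumptions a host sends 12 requests an hour and needs at most 2 or 3 of them, which is where the 9 or 10 surplus requests come from.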

Grant (SSSF) Volunteer tester Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1335548 - Posted: 7 Feb 2013, 19:15:57 UTC - in response to Message 1335544. Could that be causing a problem? It wouldn't be helping with the bandwidth issues, but it wouldn't be causing a problem in its own right. We've had Scheduler issues on & off for months now, but the current one began about 4 days ago. Prior to that, for a while at least, Scheduler responses were nice & quick. Now they take forever, if you can connect & if you don't get an error once you've done so. Grant Darwin NT

S@NL Etienne Dokkum Volunteer tester Joined: 11 Jun 99 Posts: 212 Credit: 43,822,095 RAC: 0	Message 1335551 - Posted: 7 Feb 2013, 19:18:34 UTC Sometime this afternoon the server "granted" my main rig with 200 WUs ... Soon to discover that all GPU were shorties... increasing the server load even more :S

ExchangeMan Volunteer tester Joined: 9 Jan 00 Posts: 115 Credit: 157,719,104 RAC: 0	Message 1335558 - Posted: 7 Feb 2013, 19:41:36 UTC - in response to Message 1335551. Sometime this afternoon the server "granted" my main rig with 200 WUs ... Soon to discover that all GPU were shorties... increasing the server load even more :S I hate it when I get tons of shorties. Too much bandwidth for too little credit.

BarryAZ Joined: 1 Apr 01 Posts: 2580 Credit: 16,982,517 RAC: 0	Message 1335560 - Posted: 7 Feb 2013, 19:50:47 UTC - in response to Message 1335558. Given the evidence of the past many months, it seems that there is no solution envisioned or affordable for the clear bandwidth choke point, so perhaps instead some software filtering solution might be a reasonable approach to mitigating the obvious problem. Of course, there is another approach, which takes no effort at all: as people realize that the project reliability has in fact been seriously compromised due to traffic levels, they may do as Greeley suggested -- 'go elsewhere dear user' -- there are in fact many other more reliable and interesting projects out there. Sometime this afternoon the server "granted" my main rig with 200 WUs ... Soon to discover that all GPU were shorties... increasing the server load even more :S I hate it when I get tons of shorties. Too much bandwidth for too little credit.

Keith White Joined: 29 May 99 Posts: 392 Credit: 13,035,233 RAC: 22	Message 1335561 - Posted: 7 Feb 2013, 19:51:14 UTC - in response to Message 1335548. Last modified: 7 Feb 2013, 19:51:54 UTC Could that be causing a problem? It wouldn't be helping with the bandwidth issues, but it wouldn't be causing a problem in its own right. We've had Scheduler issues on & off for months now, but the current one began about 4 days ago. Prior to that, for a while at least, Scheduler responses were nice & quick. Now they take forever, if you can connect & if you don't get an error once you've done so. I know but it's acting a lot like a memory leak. Everything stays responsive right up to the point the application starts hitting the swap space hard and it becomes slow and unresponsive. Not saying it's a memory leak, just it's acting as if some buffer that isn't being emptied quite as fast as it is being filled finally overflows and it becomes a crapshoot whether or not the next item actually gets into the buffer or is lost. "Life is just nature's way of keeping meat fresh." - The Doctor
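
The buffer behaviour described in the post above can be illustrated with a toy model. This is a minimal sketch, assuming a fixed-size queue that fills slightly faster than it drains; the capacity and rates are invented numbers, the `simulate` function is hypothetical, and none of it is a claim about how the SETI@home scheduler is actually built.

```python
def simulate(capacity, arrivals_per_tick, served_per_tick, ticks):
    """Return how many items a bounded queue drops at each tick."""
    backlog = 0
    dropped = []
    for _ in range(ticks):
        backlog += arrivals_per_tick               # new requests arrive
        overflow = max(0, backlog - capacity)      # whatever does not fit is lost
        backlog -= overflow
        dropped.append(overflow)
        backlog = max(0, backlog - served_per_tick)  # server drains the queue
    return dropped


# Invented numbers: room for 100 items, 11 arrive per tick, 10 are served.
drops = simulate(capacity=100, arrivals_per_tick=11, served_per_tick=10, ticks=120)
first_loss = next(i for i, d in enumerate(drops, start=1) if d > 0)
print("first tick with a dropped item:", first_loss)
```

With these made-up rates the backlog grows by one item per tick, so nothing is lost for a long stretch and then items are lost on every tick once the queue is full. That matches the "fine, then suddenly unreliable" pattern in the post, although a real server with bursty traffic would lose items less predictably than this deterministic model does.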

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.