Panic Mode On (78) Server Problems?

BarryAZ Send message Joined: 1 Apr 01 Posts: 2580 Credit: 16,982,517 RAC: 0	Message 1302930 - Posted: 7 Nov 2012, 0:39:36 UTC - in response to Message 1302921. Cleggy, I realize that -- but the problem which has persisted for a week or so seems unrelated to the 4 hour Tuesday outage. A number of people have reported it over the past week or more. Some are getting a tad frustrated. As for me, like I said, that's what other projects are for. Einstein, POEM and Malaria are getting a bit more of my CPU cycles for now. I'm sure the folks back at the ranch, once they figure out what the issue is with the scheduler specific to this problem (not the 'normal 4 hour outage 4 hour recovery Tuesday cycle'), will work toward resolving it and let folks know they've spotted it. ID: 1302930 ·

bill Send message Joined: 16 Jun 99 Posts: 861 Credit: 29,352,955 RAC: 0	Message 1302938 - Posted: 7 Nov 2012, 1:05:41 UTC - in response to Message 1302930. Cleggy, I realize that -- but the problem which has persisted for a week or so seems unrelated to the 4 hour Tuesday outage. A number of people have reported it over the past week or more. Some are getting a tad frustrated. As for me, like I said, that's what other projects are for. Einstein, POEM and Malaria are getting a bit more of my CPU cycles for now. (snipped) It doesn't hurt that other projects hand out credit at a higher rate than Seti does, if that sort of thing is important to you. ID: 1302938 ·

BarryAZ Send message Joined: 1 Apr 01 Posts: 2580 Credit: 16,982,517 RAC: 0	Message 1302941 - Posted: 7 Nov 2012, 1:17:22 UTC - in response to Message 1302938. Actually, those projects I noted are not all that credit generous - at least not for CPU tasks. POEM's GPU task credit is likely above that of SETI, but below that of many other GPU projects. Cleggy, I realize that -- but the problem which has persisted for a week or so seems unrelated to the 4 hour Tuesday outage. A number of people have reported it over the past week or more. Some are getting a tad frustrated. As for me, like I said, that's what other projects are for. Einstein, POEM and Malaria are getting a bit more of my CPU cycles for now. (snipped) It doesn't hurt that other projects hand out credit at a higher rate than Seti does, if that sort of thing is important to you. ID: 1302941 ·

Fred E. Volunteer tester Send message Joined: 22 Jul 99 Posts: 768 Credit: 24,140,697 RAC: 0	Message 1302946 - Posted: 7 Nov 2012, 1:35:14 UTC Last modified: 7 Nov 2012, 1:35:36 UTC I got through and reported the first 50 while on No New Tasks, but the following efforts failed. I can't get to my tasks page, and Replica is way behind. Going to be a while.... Hoping the new task limits have gone away. See comments in this thread before the maintenance window. Another Fred Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop. ID: 1302946 ·

bill Send message Joined: 16 Jun 99 Posts: 861 Credit: 29,352,955 RAC: 0	Message 1302950 - Posted: 7 Nov 2012, 2:07:28 UTC - in response to Message 1302941. Einstein's handing out 2000 credits about every 55 minutes on my dual GTX560Ti's at the moment. That's a bit better than Seti was doing. I'm still working on a 7 day cache of APs on the CPU. ID: 1302950 ·

bluestar Send message Joined: 5 Sep 12 Posts: 7034 Credit: 2,084,789 RAC: 3	Message 1302953 - Posted: 7 Nov 2012, 2:35:39 UTC Last modified: 7 Nov 2012, 2:40:31 UTC So you are thinking that you are observing or watching in the sky? Or at least trying to detect an intelligence from another civilization. Could they perhaps be moving a little bit around? Anyway, right now I have finished off my short tasks here a little while ago and now have 18 tasks that carry out the gaussian search. Therefore it takes a little more time to finish up these tasks. The question is then - are such tasks based on short tasks which have already been completed and have now been resent to users because they returned better numbers, possibly having a new task name or designation? Or do the tasks carrying out the gaussian search need a separate observation or re-observation of their own? Thanks for explaining this to me! ID: 1302953 ·

betreger Send message Joined: 29 Jun 99 Posts: 11362 Credit: 29,581,041 RAC: 66	Message 1302975 - Posted: 7 Nov 2012, 4:49:20 UTC Is a validator stuck? ID: 1302975 ·

Cherokee150 Send message Joined: 11 Nov 99 Posts: 192 Credit: 58,513,758 RAC: 74	Message 1302985 - Posted: 7 Nov 2012, 5:19:55 UTC Last modified: 7 Nov 2012, 5:26:18 UTC Those of us having trouble reporting tasks might find this helpful. I did a little experimenting this evening. I had about 150 tasks completed and waiting to report. Multiple attempts to upload with max_tasks_reported set to 0 (everything) failed as expected. Then I tried max_tasks_reported set to 10. It worked fine, also as expected. Next I tried max_tasks_reported set to 20. It also worked. Now I tried max_tasks_reported set to 40. Once again, success. Well, I thought, let's see where the upper limit is. So, I tried max_tasks_reported set to 110. It failed. Now I tried max_tasks_reported set to 100. IT WORKED! Although not 100% conclusive, it would appear that setting max_tasks_reported to anything from 10 to 100 should work when we are having these reporting troubles. I hope this is helpful. :) ID: 1302985 ·

Keith White Send message Joined: 29 May 99 Posts: 392 Credit: 13,035,233 RAC: 22	Message 1302988 - Posted: 7 Nov 2012, 5:32:35 UTC Well I'm getting scheduler timeouts again. I'll try connecting again tomorrow morning, should have a slew of units done by then since I just hit my massive shorty stack of units. "Life is just nature's way of keeping meat fresh." - The Doctor ID: 1302988 ·

Cherokee150 Send message Joined: 11 Nov 99 Posts: 192 Credit: 58,513,758 RAC: 74	Message 1302992 - Posted: 7 Nov 2012, 5:41:55 UTC - in response to Message 1302988. Last modified: 7 Nov 2012, 5:42:55 UTC Keith, Try setting your cc_config.xml file's <max_tasks_reported>0</max_tasks_reported> line to: <max_tasks_reported>100</max_tasks_reported> Then give the old manual Update button a push or two. No new tasks does -not- need to be on. I think you will find up to 100 of your completed tasks will report just fine. Of course, getting new tasks is an entirely different, and more difficult, problem. David :) ID: 1302992 ·

Tron Send message Joined: 16 Aug 09 Posts: 180 Credit: 2,250,468 RAC: 0	Message 1303001 - Posted: 7 Nov 2012, 6:05:20 UTC ok, now I'm getting a whole cache full of resends... I was not aware I had any lost tasks in the first place .. my cache was empty no ghosts. weird ID: 1303001 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304	Message 1303003 - Posted: 7 Nov 2012, 6:15:34 UTC - in response to Message 1302992. Last modified: 7 Nov 2012, 6:20:41 UTC Whatever they did during the outage, i wish they would undo it. Before, no matter how bad the network traffic was, i was still able to download work even if it was very, very, very slowly. Now all that happens is the elapsed time starts counting, but nothing actually downloads. After about 4 minutes (for a couple of WUs) they eventually started to download. I've got the rest of them sitting there, 6 minutes & counting & not even a bit has transferred yet. Eventually they downloaded, although one of them took a couple of eventual timeouts & 12 minutes before it started to download. EDIT- i am getting fewer Scheduler timeouts & "couldn't connect to server" errors than before, but there are still plenty of them. Probably half of the Scheduler requests are resulting in an error. Grant Darwin NT ID: 1303003 ·

S@NL Etienne Dokkum Volunteer tester Send message Joined: 11 Jun 99 Posts: 212 Credit: 43,822,095 RAC: 0	Message 1303010 - Posted: 7 Nov 2012, 6:43:27 UTC - in response to Message 1302992. Of course, getting new tasks is an entirely different, and more difficult, problem. David :) The max. number of tasks is set ridiculously low... I think I have about 2.5 days left and the scheduler still tells me I can't get new WUs ID: 1303010 ·

Keith White Send message Joined: 29 May 99 Posts: 392 Credit: 13,035,233 RAC: 22	Message 1303022 - Posted: 7 Nov 2012, 8:00:58 UTC Last modified: 7 Nov 2012, 8:02:57 UTC Thanks David but I'm seeing the same behavior as before. The scheduler is reporting failure yet the results I uploaded have been validated, just my client doesn't know that. Also it appears I'm building yet another ghost army. It appears I don't have a cc_config.xml on my system already. I believe it goes in programdata\boinc (I'm running Win 7 64-bit) but it didn't seem to help (yes I shut the BOINC Manager down and restarted it). I'll paste the cc_config.xml below to get an opinion if I built it right. <cc_config> <options> <max_tasks_reported>100</max_tasks_reported> </options> </cc_config> I'm off to a late night gym excursion before I get socked in by the arrival of winter so I'll check tomorrow for a reply. Thanks ahead of time. "Life is just nature's way of keeping meat fresh." - The Doctor ID: 1303022 ·

Fred E. Volunteer tester Send message Joined: 22 Jul 99 Posts: 768 Credit: 24,140,697 RAC: 0	Message 1303061 - Posted: 7 Nov 2012, 12:07:28 UTC It appears I don't have a cc_config.xml on my system already. I believe it goes in programdata\boinc (I'm running Win 7 64-bit) but it didn't seem to help (yes I shut the BOINC Manager down and restarted it). I'll paste the cc_config.xml below to get an opinion if I built it right. <cc_config> <options> <max_tasks_reported>100</max_tasks_reported> </options> </cc_config> That looks okay - and yes, it belongs in the top-level data directory, not one of the project directories or the program directory. Scheduler was modified a few weeks ago to accept a max of 64, so there's little point in using higher values unless you run other projects and need it there. Project is running inconsistently. Looking at my log for last 8 hours, I see failure to connect, timeouts, no tasks available, over the limit messages, and I got 18 cpu tasks overnight. I couldn't get any for 24 hours before that. I'm still below limit for cpu work and down to 99 cpu tasks for 6 cores. Could be a problem here, so I'm looking for corroboration that Scheduler is refusing you when you're certain that you are below the limit (50/cpu core and 400/gpu). Hard to tell with the variety of responses and connection issues. Another Fred Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop. ID: 1303061 ·
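For anyone who would rather generate the file than hand-type it, here is a minimal Python sketch of the cc_config.xml layout pasted in the posts above. The element names (cc_config, options, max_tasks_reported) come straight from this thread; the function names build_cc_config and read_max_tasks_reported are made up for illustration and are not part of BOINC. The sketch only builds and re-reads the XML text. Where to save it (the top-level BOINC data directory, per the posts above) and which value to use (64 or 100, both reported here) are left to you.

```python
import xml.etree.ElementTree as ET


def build_cc_config(max_tasks_reported: int) -> str:
    """Return cc_config.xml text with the single option discussed in this thread.

    Hypothetical helper: the element names are taken from the posts above,
    nothing else about the client's configuration is assumed.
    """
    if max_tasks_reported < 0:
        raise ValueError("max_tasks_reported must be 0 or greater")
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "max_tasks_reported").text = str(max_tasks_reported)
    # ET.indent only exists on Python 3.9+; skip pretty-printing on older versions.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


def read_max_tasks_reported(xml_text: str) -> int:
    """Parse cc_config.xml text and return the max_tasks_reported value."""
    root = ET.fromstring(xml_text)
    if root.tag != "cc_config":
        raise ValueError("root element should be <cc_config>")
    node = root.find("./options/max_tasks_reported")
    if node is None or node.text is None:
        raise ValueError("<max_tasks_reported> not found under <options>")
    return int(node.text.strip())


if __name__ == "__main__":
    text = build_cc_config(64)
    print(text)
    print("parsed value:", read_max_tasks_reported(text))
```

Running read_max_tasks_reported on a hand-written file's contents is a quick way to confirm the tags are nested and closed correctly before restarting the client.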

Khangollo Send message Joined: 1 Aug 00 Posts: 245 Credit: 36,410,524 RAC: 0	Message 1303066 - Posted: 7 Nov 2012, 12:29:15 UTC Limits now are NOT 50/cpu core, they are fixed and less than 200 WUs per HOST, BTW. So that was the fix. ID: 1303066 ·

Fred E. Volunteer tester Send message Joined: 22 Jul 99 Posts: 768 Credit: 24,140,697 RAC: 0	Message 1303069 - Posted: 7 Nov 2012, 12:45:20 UTC Limits now are NOT 50/cpu core, they are fixed and less than 200 WUs per HOST, BTW. So that was the fix. Okay, that would explain what we were discussing yesterday in this thread. I got a few more taking me to 100 cpu tasks, and then got the limits message on the next request - so maybe that's the number? Have you seen anything posted on this? Any idea on gpu limit? Another Fred Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop. ID: 1303069 ·

Paul Bowyer Volunteer tester Send message Joined: 15 Aug 99 Posts: 11 Credit: 137,603,890 RAC: 0	Message 1303071 - Posted: 7 Nov 2012, 12:52:39 UTC - in response to Message 1303069. Ever since the splitters were turned back on, I have been consistently held to exactly 100 CPU and 100 GPU. ID: 1303071 ·

Ronald R CODNEY Send message Joined: 19 Nov 11 Posts: 87 Credit: 420,920 RAC: 0	Message 1303084 - Posted: 7 Nov 2012, 13:16:07 UTC W O W,,,A/P city. 7666 in the queue for all to be had. Maybe now, the bee-atching will turn to yelps of glee. ID: 1303084 ·

Cherokee150 Send message Joined: 11 Nov 99 Posts: 192 Credit: 58,513,758 RAC: 74	Message 1303135 - Posted: 7 Nov 2012, 14:51:45 UTC - in response to Message 1303061. Fred, It appears that the 64 unit reporting limit you were told has already been changed by the staff. If you look at my post http://setiathome.berkeley.edu/forum_thread.php?id=69890&postid=1302985 and Paul's post http://setiathome.berkeley.edu/forum_thread.php?id=69890&postid=1303071 it would appear that they have now set the limit to 100 CPU, 100 GPU and 100 reports, or, probably, 100 on everything. Sadly, this is not going to fix what is obviously a different problem, as the system has proven for a long time that it can handle far larger numbers of uploads, downloads, and caches per host. As most of us know, this whole problem has been recently introduced and will require a specific fix. Limiting our hosts only compounds the problem by forcing our hosts to hit the servers far more often. I wish Matt were back, as this is his expertise (I know this to be true from my extensive, personally conducted tour of SETI and visit with him not too long ago). He would be able to track down the problem faster and patch things up quicker. Furthermore, there is also just too much work for the remaining staff to tackle problems quickly, and we are beginning to see the results of this. Oh, for a government that would realize the importance of and would fund true scientific research, or, should I say, true scientific research that does not lead to more powerful weapons and surveillance technology! But I digress... If you only knew what I knew....... (that almost sounds like lyrics to a song, lol!) ID: 1303135 ·
