Panic Mode On (77) Server Problems?

Terror Australis Volunteer tester Joined: 14 Feb 04 Posts: 1817 Credit: 262,693,308 RAC: 44	Message 1301132 - Posted: 2 Nov 2012, 2:40:41 UTC - in response to Message 1301125. Last modified: 2 Nov 2012, 2:48:23 UTC If you have set NNT you don't have to wait the 5 minutes for another "hit". Interesting to know. I thought it was a universal 303 seconds between accepted contacts. Not that I ever have a huge pile of tasks to report since I'm CPU-only and AP. My guess is that the limit is applied in the Scheduler and that a straight report without a request for new work doesn't go through the Scheduler, so the limit is not applied. T.A.
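Editor's sketch of the guess above, in Python. This is a toy model only: the class name, method, and return strings are hypothetical, and nothing here is taken from BOINC's server code. The 303-second interval is the figure mentioned in the post, and the model assumes (as the post guesses) that a report-only contact skips the interval check.

```python
from dataclasses import dataclass, field

MIN_INTERVAL_S = 303  # interval quoted in the post; an assumption, not a verified server setting


@dataclass
class HypotheticalScheduler:
    """Toy model of the poster's guess; not BOINC's actual scheduler."""

    # host id -> time (in seconds) of the last accepted work request
    last_work_contact: dict = field(default_factory=dict)

    def handle(self, host_id: str, now: float, wants_work: bool) -> str:
        if not wants_work:
            # Report-only contact (NNT set): in this model the limit is never consulted.
            return "accepted"
        last = self.last_work_contact.get(host_id)
        if last is not None and now - last < MIN_INTERVAL_S:
            return "too soon"
        self.last_work_contact[host_id] = now
        return "accepted"
```

In this model a host that asks for work twice within 303 seconds is turned away the second time, while the same host with NNT set can report immediately.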

Brother Frank Joined: 10 Dec 11 Posts: 26 Credit: 15,142,410 RAC: 0	Message 1301141 - Posted: 2 Nov 2012, 3:28:27 UTC - in response to Message 1301112. Dear Sutaru Tsureku, I did what you suggested and set my download to no new tasks and then waited five minutes or so. Then I "abused my Update button" by pressing it, but only once. I used to really hit the darned thing so often, but I'm getting more subtle and not "using that hammer" all the time. A few minutes later, all of my 50 or so built-up completed tasks had been swallowed up. I moved from this desktop with one Nvidia Ti 550 graphics card to my new notebook and did the same thing. Again, in just a few minutes my 40 or so completed tasks were swallowed. I went downstairs to my i7 2600k desktop with two Nvidia Ti 550 graphics cards and it had swallowed all its completed tasks -- probably 70-80 of them. I am feeling a great sense of relief, like you have helped me find a wonderful brand of computer laxative. Wow, thanks for reminding me of this wonderful inexpensive procedure. I am breaking out in song, "Oh What A Relief IT IS -- la la la la la.... la la. Hurray." I hope this is not too racy. It's in the same fine tradition found in the writings of the esteemed 16th Century author Francois Rabelais, a scholar and humorist who penned the tales of Gargantua and Pantagruel. French literature, including Rabelais and Michel de Montaigne, was one of my very favorite college courses. Brother Frank

Sirius B Volunteer tester Joined: 26 Dec 00 Posts: 24879 Credit: 3,081,182 RAC: 7	Message 1301144 - Posted: 2 Nov 2012, 3:58:19 UTC This is a first for me. After working on my Win 8 host, allowed it to download tasks. After several timeouts, found that I had downloaded 16 wu's. However, on looking at My computers page, found that I had 119 wu's, but 103 marked "abandoned"?

Grant (SSSF) Volunteer tester Joined: 19 Aug 99 Posts: 13731 Credit: 208,696,464 RAC: 304	Message 1301148 - Posted: 2 Nov 2012, 4:17:39 UTC - in response to Message 1301144. While at work today not once were either of my systems able to report any work. "Scheduler request failed: Timeout was reached" is the only response they get. And naturally their caches continue to shrink. Grant Darwin NT

Khangollo Joined: 1 Aug 00 Posts: 245 Credit: 36,410,524 RAC: 0	Message 1301151 - Posted: 2 Nov 2012, 4:41:23 UTC - in response to Message 1301148. Last modified: 2 Nov 2012, 4:41:57 UTC After several timeouts, found that I had downloaded 16 wu's. However, on looking at My computers page, found that I had 119 wu's, but 103 marked "abandoned"? You're lucky. My computer got its entire 5-day cache of 692 tasks (including some of the new APs) abandoned today for absolutely no reason. The worst of it is, none of the workunits were deleted locally, so if I hadn't aborted them, I'd be doing 5 days of pointless work.

Keith White Joined: 29 May 99 Posts: 392 Credit: 13,035,233 RAC: 22	Message 1301188 - Posted: 2 Nov 2012, 6:23:16 UTC As someone pointed out, and as I see on my own host, the workunits we've uploaded are being processed and credited; what's failing is the notification that clears the workunits listed as "ready to report" off the task tab, as well as the fetching of any new workunits. Of course you know what this means. When the blockage is finally fixed we will be getting a metric ton of new workunits, all with the same timestamp and likely paired with the same wingman or two. And if history is any indication, there is a high likelihood that the wingman will be someone a) whose CUDA application always overflows, thus leading to an inconclusive validation; b) who has a 29.9 day turnaround time; or c) who forgot they signed up for Seti@Home, so all the tasks time out after six or so weeks. No, I'm not cynical. Not at all. :p "Life is just nature's way of keeping meat fresh." - The Doctor

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1301197 - Posted: 2 Nov 2012, 6:58:45 UTC Last modified: 2 Nov 2012, 6:59:08 UTC All I can add here is right now there is a LOT of bandwidth simply being wasted. When I watch my 9 rigs try over and over again to report completed tasks, only to have the attempts go awry after sending the data........ All that bandwidth has been wasted, further clogging the pipes with comms that accomplish nothing. I can only hope that the new up/download servers and other improvements the GPUUG is working on may somewhat mitigate this disaster. "Freedom is just Chaos, with better lighting." Alan Dean Foster

Uli Volunteer tester Joined: 6 Feb 00 Posts: 10923 Credit: 5,996,015 RAC: 1	Message 1301202 - Posted: 2 Nov 2012, 7:09:27 UTC I have been monitoring my rig and have no issues. Except Mark is partnered with me on one of my inconclusives. I'm sure it is stuck in the upload problem. Pluto will always be a planet to me. Seti Ambassador Not too late to order an Anni Shirt

musicplayer Joined: 17 May 10 Posts: 2430 Credit: 926,046 RAC: 0	Message 1301210 - Posted: 2 Nov 2012, 8:08:17 UTC Last modified: 2 Nov 2012, 8:10:00 UTC What about the idea of perhaps assigning so-called "shorties" tasks to the slower hosts? Most of these "shorties" probably do not return results of any particular significance. Therefore the more powerful hosts (computers) could be assigned the longer tasks, which carry out the Gaussian search, as well as possibly the AstroPulse (AP) tasks.

Uli Volunteer tester Joined: 6 Feb 00 Posts: 10923 Credit: 5,996,015 RAC: 1	Message 1301213 - Posted: 2 Nov 2012, 8:17:12 UTC Last modified: 2 Nov 2012, 8:19:15 UTC MP, all our science counts. Your fascination with Gaussians astounds me. When in time we find the signal, everyone will win. Pluto will always be a planet to me. Seti Ambassador Not too late to order an Anni Shirt

Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1301282 - Posted: 2 Nov 2012, 15:45:44 UTC Mark, I noticed something last night that I think you're talking about. I reported one completed AP and the website showed it as reported, but BOINC didn't get the memo and instead.. "timeout was reached." The next report cleared it up, but every time you talk to the scheduler, your client_state has to be sent. I imagine for the faster rigs, it is probably in the >1MB range, so I agree.. wasted bandwidth for scheduler time-outs. Kind of like cramming 16MB of data through the pipe only to have it be 100% blanked. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up)

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1301288 - Posted: 2 Nov 2012, 15:54:01 UTC - in response to Message 1301282. Last modified: 2 Nov 2012, 15:54:59 UTC Mark, I noticed something last night that I think you're talking about. I reported one completed AP and the website showed it as reported, but BOINC didn't get the memo and instead.. "timeout was reached." The next report cleared it up, but every time you talk to the scheduler, your client_state has to be sent. I imagine for the faster rigs, it is probably in the >1MB range, so I agree.. wasted bandwidth for scheduler time-outs. Kind of like cramming 16MB of data through the pipe only to have it be 100% blanked. Yeah.... All the data gets sent through the pipe, and then the server fumbles the ball......more data then transferred trying to determine who recovered the fumble. Uhh...back to 1st down and goal to go. More bandwidth consumed on the next attempt. 4th down......how many times are they gonna try? Oh crap, they missed the field goal. 1st down.......more bandwidth consumed and we still have not scored. You get my drift. When things get tangled like this, it's a downward spiral. More bandwidth gets used trying to recover fumbles than moving the dang ball. "Freedom is just Chaos, with better lighting." Alan Dean Foster

cov_route Joined: 13 Sep 12 Posts: 342 Credit: 10,270,618 RAC: 0	Message 1301293 - Posted: 2 Nov 2012, 16:01:11 UTC Now I have over 3000 phantom wu's, up from about 2k last night. This is about 5x my normal cache size. Should I set NNT? Or just let it do its thing?

zoom3+1=4 Volunteer tester Joined: 30 Nov 03 Posts: 65738 Credit: 55,293,173 RAC: 49	Message 1301297 - Posted: 2 Nov 2012, 16:08:36 UTC - in response to Message 1301202. I have been monitoring my rig and have no issues. Except Mark is partnered with me on one of my inconclusives. I'm sure it is stuck in the upload problem. I'm having less trouble uploading than reporting, something needs some computer exlax for the reporting... The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1301299 - Posted: 2 Nov 2012, 16:10:48 UTC - in response to Message 1301293. Now I have over 3000 phantom wu's, up from about 2k last night. This is about 5x my normal cache size. Should I set NNT? Or just let it do its thing? The kitties are just letting Boinc do its thing. The only change I made a while back, when the scheduler was tied in knots, was to add a bit to my cc_config file to report only 100 WUs at a time. That helped at the time, but it is not improving things much right now. "Freedom is just Chaos, with better lighting." Alan Dean Foster
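Editor's sketch: the post above does not name the cc_config setting it used. Assuming it refers to the BOINC client's `<max_tasks_reported>` option inside `<options>` (an assumption, not confirmed in this thread), the fragment could be generated like this. The helper name `build_cc_config` is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_cc_config(max_reported: int) -> str:
    """Build a minimal cc_config.xml body capping tasks reported per scheduler request.

    Assumes the relevant option is <max_tasks_reported>; check the client
    configuration documentation for your BOINC version before relying on it.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "max_tasks_reported").text = str(max_reported)
    return ET.tostring(root, encoding="unicode")


print(build_cc_config(100))
```

If a cc_config.xml already exists, the element would need to be merged into its existing `<options>` section rather than replacing the whole file, and the client has to re-read its config files (or be restarted) before the change applies.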

Link Joined: 18 Sep 03 Posts: 834 Credit: 1,807,369 RAC: 0	Message 1301305 - Posted: 2 Nov 2012, 16:24:57 UTC - in response to Message 1301282. (...) but every time you talk to the scheduler, your client_state has to be sent. I imagine for the faster rigs, it is probably in the >1MB range, so I agree.. wasted bandwidth for scheduler time-outs. sched_request_setiathome.berkeley.edu.xml is what gets sent to the scheduler, and it should be quite a bit smaller than client_state.xml, since it doesn't contain all the information about all files or about other projects, and carries only very sparse information about the SETI tasks on that machine.
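Editor's sketch of the bandwidth arithmetic discussed in the last few posts. The request size and retry count below are illustrative assumptions, not measurements; as noted above, the real request file is likely smaller than the 1 MB guess. The function name is hypothetical.

```python
def wasted_upload_bytes(request_bytes: int, failed_attempts: int, hosts: int = 1) -> int:
    """Bytes sent to the scheduler by requests that ended in a timeout.

    Every failed attempt re-sends the full request, so the waste grows
    linearly with the number of retries and the number of hosts.
    """
    return request_bytes * failed_attempts * hosts


# Illustrative only: a 1 MB request, 20 timed-out attempts, 9 rigs.
print(wasted_upload_bytes(1_000_000, 20, hosts=9) / 1_000_000, "MB")
```

Swapping in a smaller request size lowers the total proportionally, but any request that times out is still spent bandwidth.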

Cherokee150 Joined: 11 Nov 99 Posts: 192 Credit: 58,513,758 RAC: 74	Message 1301320 - Posted: 2 Nov 2012, 17:09:22 UTC - in response to Message 1301299. Mark, I suspect, however, that the kitties are not purring right now. ;) By the way, I did do one thing to help alleviate the obviously gigantic mess developing between the SETI host and its clients. I suspended network activity on all my machines. I always keep my caches maxed out to handle these emergencies, so I'm good-to-go for at least a week. It'll be a while before any of my units time out. When the dust settles and traffic is back to normal, I'll just let my rigs report their stored results one machine at a time. :)

Wibble Joined: 25 Nov 02 Posts: 4 Credit: 1,168,325 RAC: 0	Message 1301354 - Posted: 2 Nov 2012, 18:29:08 UTC - in response to Message 1301112. Last modified: 2 Nov 2012, 18:30:50 UTC Maybe the last 20 hours or something no well scheduler contact. Always: Scheduler request failed: Timeout was reached new tasks was enabled. I set no new tasks - and then 178 uploaded tasks were accepted from the scheduler server in a bunch (successful report). That also worked for me, my last successful scheduler request was on the 30th, I just set 'no new tasks,' clicked 'update' and the request/report went through straight away. I think I'll leave 'no new tasks' set for a while.

Grant (SSSF) Volunteer tester Joined: 19 Aug 99 Posts: 13731 Credit: 208,696,464 RAC: 304	Message 1301358 - Posted: 2 Nov 2012, 18:33:37 UTC - in response to Message 1301354. That also worked for me, my last successful scheduler request was on the 30th, I just set 'no new tasks,' clicked 'update' and the request/report went through straight away. I've tried it with NNT set, and without. I think the only advantage of having NNT set is that you can click update as soon as you get the timeout response from the Scheduler without it complaining about it being too soon. The fact is that overnight neither of my systems was able to get a response from the Scheduler that wasn't a timeout. Completed work to be reported piles up, and my caches get smaller & smaller. Grant Darwin NT

Grant (SSSF) Volunteer tester Joined: 19 Aug 99 Posts: 13731 Credit: 208,696,464 RAC: 304	Message 1301392 - Posted: 2 Nov 2012, 20:18:24 UTC - in response to Message 1301358. I hope they can sort this Scheduler out soon. With all the shorties in the system, and the inability to get work on more than 1 attempt in 20, I'm going to run out of work very quickly when almost everything that does get downloaded will be done in minutes. Grant Darwin NT


SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.