Panic Mode On (19) Server problems


FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 914019 - Posted: 4 Jul 2009, 17:24:52 UTC

I bet a select few are able to download these WUs, which maxes out the system and leaves the rest of us unable to connect.
Then, by the time we can upload or download again, all the tasks have been taken, leaving the cupboard bare.
Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 914020 - Posted: 4 Jul 2009, 17:36:29 UTC

Do you see that big 18 hour plus gap in the bandwidth???

Cricket Graph

This gap is where I performed my deception and blocked the bandwidth in order to prevent all the crunchers from discovering my home planet. Of course, I required assistance from those other aliens.... you know them as the "Grays". Their job was to suppress knowledge of this among the human race, but they have failed. They did not discover the Cricket in time. Now arrangements are being made with politicians to cover this up, along the same lines as the Roswell Incident.

Boinc....Boinc....Boinc....Boinc....
Matthew Love
Volunteer tester
Joined: 26 Sep 99
Posts: 7763
Credit: 879,151
RAC: 0
United States
Message 914024 - Posted: 4 Jul 2009, 17:45:35 UTC

CHARLIE CHAN WILL SAVE THE DAY FROM THE TERRIBLE FOES!!

LETS BEGIN IN 2010
Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 914025 - Posted: 4 Jul 2009, 17:46:34 UTC - in response to Message 914016.  

Still seeing RED.

By the way.....how can 69,000+ MultiBeam work units instantly show up ready for download?
And then the bandwidth maxes out? Instantly?

They didn't show up instantly. At 8:30 the result creation rate jumped to over thirty (that part was instant :-). The resulting download rush used up the bandwidth (almost instantly), so the ready-to-send count could only build up over the next two hours.
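
As a toy illustration of that build-up (all numbers below are made up, purely to show the shape of the effect):

# Gundolf's point as a toy queue model: the ready-to-send count grows by
# whatever margin the creation rate has over the rate at which the choked
# link can actually hand results out. Hypothetical numbers only.

def ready_to_send(initial: int, creation_per_sec: float,
                  dispatch_per_sec: float, seconds: float) -> int:
    """Ready-to-send queue size after `seconds` at constant rates."""
    return max(0, int(initial + (creation_per_sec - dispatch_per_sec) * seconds))

# e.g. creating ~30 results/sec while the saturated link lets only ~20/sec
# out gives roughly a 72,000-result backlog after two hours:
print(ready_to_send(0, 30, 20, 2 * 3600))  # 72000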

Regards,
Gundolf
Computers aren't everything in life. (Just a little joke.)

SETI@home classic workunits 3,758
SETI@home classic CPU time 66,520 hours
Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 914027 - Posted: 4 Jul 2009, 17:53:59 UTC - in response to Message 914023.  

Do you see that big 18 hour plus gap in the bandwidth???

Cricket Graph

This gap is where I performed my deception and blocked the bandwidth in order to prevent all the crunchers from discovering my home planet. Of course, I required assistance from those other aliens.... you know them as the "Grays". Their job was to suppress knowledge of this among the human race, but they have failed. They did not discover the Cricket in time. Now arrangements are being made with politicians to cover this up, along the same lines as the Roswell Incident.


Did you really get permission from our home planet to reveal that?


Oh my........ this is rapidly falling apart. Knowledge of the gap is spreading around the planet, and Obama is the only politician reachable on this holiday weekend. He does not have the necessary clearance, and besides, he would make a major TV broadcast about it and speak for several hours. But hey, maybe that would work..... put the entire planet to sleep with him speaking on and on and on........ must do further research on this.

Boinc....Boinc....Boinc....Boinc....
BarryAZ
Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 914030 - Posted: 4 Jul 2009, 18:04:15 UTC - in response to Message 914015.  
Last modified: 4 Jul 2009, 18:07:14 UTC

Look, we know the litany: the SETI project is doing a *lot* with *relatively* limited resources. (I say *relatively* limited because there are dozens of other projects with far fewer resources than SETI.) Given that the 'Give me more power (or money or resources)' lament, from Captain user to Engineer SETI project, is a constant and one never to be fulfilled (more resources begets the need for more resources), it seems to me that folks really ought to be exploring one of the actual good things about the BOINC platform -- project diversity. Add more projects, tamp down the share of your CPU (and/or GPU) cycles that is allocated to SETI, and balance the load.

SETI currently has four times as many users as the next largest batch of BOINC projects (Rosetta, Climate, World Grid, and Einstein), and after that the user count drops way off to much smaller projects -- which, by the way, are operating pretty reliably on much smaller resource budgets, in part because users are drawn, moths to the flame, to SETI. These other projects are very often doing quite serious science as well.

If you are running CUDA devices, consider GPUGrid, for example; or, if you have fast ATI GPU resources sitting unused, consider MilkyWay, which currently provides the only optimized application that supports ATI GPUs.

If you prefer long-running work units, consider Climate; for mid-length work units, Einstein works fine; and for shorter work, there are a host of other projects which are generally running more reliably than SETI.

I mean, let's face it, the 'work' being done here is probably best characterized as speculative science. That isn't a bad thing, but if there is a resource bottleneck (and there is), it simply makes a lot of sense to reduce frustration levels by supporting, to a larger degree, projects engaged in basic science research, thereby easing the apparently permanent overload condition that exists here.

Folks who simply stay with SETI and moan about its many issues (and, if people elect to be honest about it, SETI is, for various reasons, among the least 'solid' of the BOINC projects), or folks who go into denial about those issues and tout the 'give SETI more resources' line to the exclusion of alternatives, seem to me to rather miss the mark, and they have missed it consistently over the past few years.




IMHO Ned - well spoken.
However, one must also consider the frustration that ensues when one can't upload, download, or even report - as is currently the case, probably until after next Tuesday.

HAL
Joined: 28 Mar 03
Posts: 704
Credit: 870,617
RAC: 0
United States
Message 914032 - Posted: 4 Jul 2009, 18:15:40 UTC

56 more minutes and the first of my farm goes offline permanently - that's 2 WU's a day the supercrunchers won't have to compete for. The next one exits tomorrow morning.
No wingmen will be left stranded!
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 914049 - Posted: 4 Jul 2009, 19:39:03 UTC - in response to Message 914015.  

IMHO Ned - well spoken.
However, one must also consider the frustration that ensues when one can't upload, download, or even report - as is currently the case, probably until after next Tuesday.


But your machines aren't frustrated - and there's no reason you should be either. You are not being deprived of anything other than worthless credits if your computers can't upload.

There's no reason why any of us should be so emotionally invested in the machine side of this project.
Matthew Love
Volunteer tester
Joined: 26 Sep 99
Posts: 7763
Credit: 879,151
RAC: 0
United States
Message 914054 - Posted: 4 Jul 2009, 19:53:17 UTC

Did they have a major server meltdown?

LETS BEGIN IN 2010
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 914056 - Posted: 4 Jul 2009, 19:58:41 UTC - in response to Message 914054.  

Did they have a major server meltdown?


Several issues have compounded the problem, but the major one seems to be the high traffic generated by clients requesting work more often, because shorter workunits are being sent out. Also, some new AP work was created after a period with no AP; those workunits are a bit larger than the normal ones, and there are a lot of fast, hungry crunchers asking for that work as well.

To add to that, it seems that every time the requests die down, the staff have their weekly server outage to perform their routine tasks, which means holding off the masses until the servers are back up. Then everyone comes rushing back in for more work and the cycle starts all over again.
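
As a toy model of that post-outage rush (made-up numbers, and not a description of the actual scheduler, just an illustration of why the backlog takes hours to drain):

# Requests that pile up during the weekly outage all arrive at once when
# the servers come back, so the rush itself takes hours to clear.
# All figures below are hypothetical.

def hours_to_drain(hosts_waiting: int, requests_served_per_hour: int,
                   new_requests_per_hour: int) -> float:
    """Hours until the post-outage backlog of work requests is cleared."""
    spare_capacity = requests_served_per_hour - new_requests_per_hour
    if spare_capacity <= 0:
        return float("inf")  # fresh demand alone saturates the servers
    return hosts_waiting / spare_capacity

# 150,000 hosts held off during the outage, servers clearing 40,000
# requests/hour, 25,000 fresh requests/hour arriving on top:
print(hours_to_drain(150_000, 40_000, 25_000))  # 10.0 hours of crunch afterwards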
DPRGI - Luivul
Joined: 24 Jan 03
Posts: 17
Credit: 20,639,801
RAC: 0
Italy
Message 914062 - Posted: 4 Jul 2009, 20:22:29 UTC - in response to Message 914056.  

Did they have a major server meltdown?


Several issues have compounded the problem, but the major one seems to be the high traffic generated by clients requesting work more often, because shorter workunits are being sent out. Also, some new AP work was created after a period with no AP; those workunits are a bit larger than the normal ones, and there are a lot of fast, hungry crunchers asking for that work as well.

To add to that, it seems that every time the requests die down, the staff have their weekly server outage to perform their routine tasks, which means holding off the masses until the servers are back up. Then everyone comes rushing back in for more work and the cycle starts all over again.


Just one question: why did the system work fine, with normal network traffic, from 13:30 on Friday to 8:30 on Saturday?
That it lasted only about 20 hours seems very strange to me.
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 914069 - Posted: 4 Jul 2009, 20:41:13 UTC - in response to Message 914062.  
Last modified: 4 Jul 2009, 20:45:19 UTC

Just one question: why did the system work fine, with normal network traffic, from 13:30 on Friday to 8:30 on Saturday?

Network traffic hasn't been normal for any length of time in over 2 weeks. During the period you mention, the splitters weren't running properly and very little work was being produced. That has been fixed, which is why we now have so much download traffic.
Given the length of that problem period, it will probably take well over 18 hours for this to clear.
Grant
Darwin NT
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 914072 - Posted: 4 Jul 2009, 20:50:25 UTC

OK, I'm having a bit of difficulty getting my head around what is going on with uploads/downloads here.

Accepted - when download traffic is too high, upload traffic is strangled. Within an hour of that starting, my quaddie + GTX 295 will stop asking for new work, since the number of WUs waiting to upload will be more than twice the number of CPUs, so I am no longer contributing to the download traffic. For the top hosts, with multiple GPUs, this will happen even quicker.

The current download "spike" has lasted rather more than 4 hours. By my reckoning that should mean that all C2D and C2Q hosts (as well as anything faster) should have a backlog of uploads that is preventing work requests by now. So the download "spike" is being maintained by hosts that crunch relatively few WU's per day to fill a gap that was left by a a 9-hour lack of downloads?

Something does not seem to add up...

F.

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 914076 - Posted: 4 Jul 2009, 20:57:05 UTC - in response to Message 914072.  

Something does not seem to add up...

Don't forget that the traffic previously died before all hosts had refilled their caches. So you've got those caches that still weren't full, combined with all the work that was processed and not replaced while the splitters weren't splitting. And it looks like there's no more AP work being generated, so those hosts will now be getting more MB work than when there was AP work available.
Grant
Darwin NT
TCP JESUS
Joined: 19 Jan 03
Posts: 205
Credit: 1,248,845
RAC: 0
Canada
Message 914080 - Posted: 4 Jul 2009, 21:20:38 UTC

What cache sizes are the bigger 'crunchers' (with GPUs) running at currently?

I have been running a 4-day cache and was lucky enough to make it through the last little 'hiccup' the other day without running out of work (with the help of Reschedule 1.7)..... but today is a different story. It looks like I will be idle before the sun goes down if I can't upload some results.

I had considered increasing my cache by a day or so, but I left it alone to see what would happen after the last network max-out. Unfortunately, I wasn't able to come even close to refilling HALF my cache, so a change on my part wouldn't have helped at the time.

Is a 4-day cache size common for a 'big rig' (octo-core OC'd i7 w/ twin OC'd GTX 260s)? Should I consider 5 or more days?

Thanks.
Allan
I am TCP JESUS...The Carpenter Phenom Jesus....and HAMMERING is what I do best!
formerly known as...MC Hammer.
Andy Williams
Volunteer tester
Joined: 11 May 01
Posts: 187
Credit: 112,464,820
RAC: 0
United States
Message 914084 - Posted: 4 Jul 2009, 21:26:37 UTC - in response to Message 914080.  

Is a 4-day cache size common for a 'big rig' (octo-core OC'd i7 w/ twin OC'd GTX 260s)? Should I consider 5 or more days?


I've been running with 3 days since January. I ran out a couple of times. Given the latest round of troubles, I bumped it to 5 days halfway through June. Haven't run dry.
--
Classic 82353 WU / 400979 h
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 914087 - Posted: 4 Jul 2009, 21:33:49 UTC


I got ~ 250 WUs today..

~ 80 VLARs killed, ~ 170 'shorties' and 'normal' WUs.

Ohh well..

My GPU cruncher now just sits idle during the day.

Yes.. it needs ~ 800 MB AR=0.44x WUs/day.

For some hours now, 14 result uploads haven't wanted to go to Berkeley.
So no new work requests.. even though new work would be available?


I'm really %$@&"§$%@!$€%§$% !!


I think the best thing would be to switch the GPU cruncher OFF and sell it.
Then I would have a more relaxed life and wouldn't need to 'babysit' the PC.


Sorry.. but.. since I got my GPU cruncher I have had no fun with SETI@home.


BTW.
I'll answer the questions about my post in the last 'panic thread' tomorrow.
Right now I'm tired.


BTW.
BOINC can only hold about a 3-day cache on my GPU cruncher, ~ 2,400 WUs. If I went any higher, BOINC would go into crazy EDF mode and the PC couldn't crunch - the system RAM and CPU would be overloaded.
It's now the second time in ~ 1 1/2 months that I've needed to switch it OFF. ~ 250 W in idle mode is too much just to wait for new work.

Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 914091 - Posted: 4 Jul 2009, 21:45:46 UTC - in response to Message 914080.  

Is a 4-day cache size common for a 'big rig' (octo-core OC'd i7 w/ twin OC'd GTX 260s)? Should I consider 5 or more days?

Thanks.
Allan

Going beyond half the minimum turn-round time increases the chances of running into EDF, so I have stuck at 3 days, and I have only run out of work once, after I deliberately ran down my cache to detach/re-attach following a failed upgrade, releasing about 1000 WUs back into the pool. The problem with EDF and CUDA (particularly if you have more than one GPU) is that multiple WUs can be put into "waiting to run", and this seems to increase the likelihood of CPU fall-back being invoked.

I would recommend staying below 3.5 days for the cache.
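
As a rough sketch of that rule of thumb (the 7-day deadline below is only an assumption implied by the 3.5-day ceiling, not a figure quoted anywhere in this thread):

# Keep the cache under half the shortest task deadline so a backlog or an
# outage doesn't push BOINC into EDF (earliest-deadline-first) panic mode.

def max_safe_cache_days(shortest_deadline_days: float,
                        safety_factor: float = 0.5) -> float:
    """Largest cache (in days) that stays under the given fraction of the
    shortest deadline among the tasks a host receives."""
    return shortest_deadline_days * safety_factor

print(max_safe_cache_days(7.0))        # 3.5 -- the recommended ceiling
print(max_safe_cache_days(7.0) < 4.0)  # True: a 4-day cache already risks EDF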

F.
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 914093 - Posted: 4 Jul 2009, 21:52:53 UTC - in response to Message 914015.  

IMHO Ned - well spoken.
However, one must also consider the frustration that ensues when one can't upload, download, or even report - as is currently the case, probably until after next Tuesday.

Frustration, sure. Frustration because demand is exceeding capacity; my frustration because BOINC has a tendency to beat on the servers.

In my opinion, most of this is caused by loading, and the thing that takes care of loading is time -- and weekdays and weekends are all the same: time passes even when Matt isn't at his desk.

Frustration too because most people seem to compare SETI (which is not that time-critical) to something like Amazon.com, where missed connections are lost revenue. Amazon needs multiply-redundant everything. BOINC projects don't.

We just went through a longish period of failed uploads, and someone else commented that bandwidth is now pegged with downloads.

BOINC is supposed to tolerate all of this, and as a general rule, it does pretty well.

But "incompetent" is a little much. Those of us who hang out here know that Matt and Eric and Jeff, et. al., are both competent and dedicated, or we wouldn't see them on the weekends.
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 914095 - Posted: 4 Jul 2009, 21:56:30 UTC - in response to Message 914030.  

Look, we know the litany: the SETI project is doing a *lot* with *relatively* limited resources. (I say *relatively* limited because there are dozens of other projects with far fewer resources than SETI.) Given that the 'Give me more power (or money or resources)' lament, from Captain user to Engineer SETI project, is a constant and one never to be fulfilled (more resources begets the need for more resources), it seems to me that folks really ought to be exploring one of the actual good things about the BOINC platform -- project diversity. Add more projects, tamp down the share of your CPU (and/or GPU) cycles that is allocated to SETI, and balance the load.

SETI currently has four times as many users as the next largest batch of BOINC projects (Rosetta, Climate, World Grid, and Einstein), and after that the user count drops way off to much smaller projects -- which, by the way, are operating pretty reliably on much smaller resource budgets, in part because users are drawn, moths to the flame, to SETI. These other projects are very often doing quite serious science as well.

If you are running CUDA devices, consider GPUGrid, for example; or, if you have fast ATI GPU resources sitting unused, consider MilkyWay, which currently provides the only optimized application that supports ATI GPUs.

If you prefer long-running work units, consider Climate; for mid-length work units, Einstein works fine; and for shorter work, there are a host of other projects which are generally running more reliably than SETI.

I mean, let's face it, the 'work' being done here is probably best characterized as speculative science. That isn't a bad thing, but if there is a resource bottleneck (and there is), it simply makes a lot of sense to reduce frustration levels by supporting, to a larger degree, projects engaged in basic science research, thereby easing the apparently permanent overload condition that exists here.

Folks who simply stay with SETI and moan about its many issues (and, if people elect to be honest about it, SETI is, for various reasons, among the least 'solid' of the BOINC projects), or folks who go into denial about those issues and tout the 'give SETI more resources' line to the exclusion of alternatives, seem to me to rather miss the mark, and they have missed it consistently over the past few years.

Amen.