Magic Carpet Ride (Jul 19 2007)

Author	Message
Matt Lebofsky Volunteer moderator Project administrator Project developer Project scientist Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0	Message 606152 - Posted: 19 Jul 2007, 22:02:00 UTC
Another day of minor tasks. Spent a chunk of the morning learning "parted", which I guess replaced "fdisk" for partitioning disks in the world of Linux. Worked with Bob to figure out why recent science database dumps are failing and how to install the latest version of Informix (for replica testing). Jeff and I started mapping our updated power requirements for the closet - we have a couple of UPS's with red lights, meaning we have some batteries to replace soon.
Sometimes I feel about UPS's like I feel about all forms of insurance (car, house, health, etc.). Extra expense and effort up front to set up, regular expense and effort to maintain, and then when push comes to shove they don't save your butt nearly as well as you thought they would. In fact, a lot of the time they make things worse. I've had UPS's just up and die and take systems along with them. Likewise, I've had two different insurance agencies on two separate occasions screw up their own paperwork, thus nullifying my policies without notifying me, wreaking havoc on my life in various unpredictable, unamusing ways. Okay, I'm ranting here...
As for the reasons stated earlier for why our results-to-send queue went to zero a couple of days ago, others have since suggested that, due to news of the impending power outage this weekend, many users have been flushing their caches to ensure they have enough work to withstand the predicted downtime. If this is indeed true, this could be seen as a distributed denial-of-service attack. But don't worry - I won't be calling the police.
Played a gig last night for a giant Applied Materials party in San Francisco. I like the fact that I get paid about four times as much per hour performing songs like "Magic Carpet Ride" at these hyper-techie functions as I do actually managing the back-end network of the world's largest supercomputing project.
- Matt
-- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Bernie Vine Volunteer moderator Volunteer tester Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328	Message 606210 - Posted: 19 Jul 2007, 23:34:33 UTC Last modified: 19 Jul 2007, 23:35:27 UTC
Ahh, UPS's, my favorite topic recently - I feel the same as you. My company has sites all over the UK. I oversee 20 and recently had 2 UPS failures that actually took the servers down; the diagnostic suggested "internal UPS fault please contact...". Fine when they work, but otherwise a general pain. Thanks for the update.

1mp0£173 Volunteer tester Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 606219 - Posted: 19 Jul 2007, 23:43:08 UTC - in response to Message 606210.
If you have a UPS, you should be able (and ready) to pull the plug at any time, and do so without fear. If you can't, you need to buy new UPSes. My servers draw about 500 watts. I have two 2200 VA UPSes on an automatic transfer switch. As long as one has power, everything runs fine. I also test run-time (I can let one "run flat" and the transfer switch handles it). The problem is when they aren't tested routinely, you get surprises.
-- Ned
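
Ned's figures (about 500 watts of servers, 2200 VA per UPS) can be turned into a rough sizing check. The sketch below is only an illustration: the function name is made up, and the 0.7 power factor used to convert VA to watts is an assumption, not a value from the post or from any UPS datasheet. Use the rated watt figure of the actual unit where it is available.

```python
def ups_load_fraction(load_watts: float, rating_va: float,
                      power_factor: float = 0.7) -> float:
    """Return the load as a fraction of the UPS's usable watt capacity.

    power_factor is an ASSUMED VA-to-watt conversion (hypothetical default);
    prefer the watt rating printed on the unit's datasheet.
    """
    if rating_va <= 0:
        raise ValueError("rating_va must be positive")
    if not 0 < power_factor <= 1:
        raise ValueError("power_factor must be in (0, 1]")
    capacity_watts = rating_va * power_factor
    return load_watts / capacity_watts


# Figures from the post: about 500 W carried by a single 2200 VA UPS.
fraction = ups_load_fraction(500, 2200)
print(f"Load is {fraction:.0%} of the assumed capacity")
```

With two such units behind a transfer switch, the point of the check is that either one alone must be able to carry the whole load.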

Dena Wiltsie Volunteer tester Joined: 19 Apr 01 Posts: 1628 Credit: 24,230,968 RAC: 26	Message 606352 - Posted: 20 Jul 2007, 5:10:05 UTC
The big problem with UPS's is that they use lead-acid batteries. While they can last as long as 6 years, 4 years is pushing it. If uptime is important, put the date the new batteries were installed on the outside of the unit and replace the batteries before they have time to fail. I work with APC, and the only failures I have seen were due to batteries and an incorrectly wired outlet.

Trueinnerpeace Joined: 21 May 99 Posts: 8 Credit: 184,805 RAC: 0	Message 606377 - Posted: 20 Jul 2007, 8:04:38 UTC
At the risk of overstating the obvious, preventative maintenance, like everything in life, is key. As a former DEC field service engineer from the late '70s, I can tell you that PMs were a regular routine, and in the intervening years I see nothing has changed other than things having gotten smaller. Oh hum...

Bernie Vine Volunteer moderator Volunteer tester Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328	Message 606592 - Posted: 20 Jul 2007, 18:32:50 UTC - in response to Message 606219.
Whilst I agree, unfortunately my company expanded very rapidly about 4 years ago and cost was an important consideration, so UPS's were just installed and "forgotten". The monitoring software wasn't even installed in most cases. Now of course we are suffering. Each site was installed with just one UPS and most run at around 50-60% load, and until I instigated a program of installing the software and getting the UPS's to report failures, the first we knew of problems was when there was a power outage and the UPS immediately failed. We're still getting to grips with them now.
Bernie

1mp0£173 Volunteer tester Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 606594 - Posted: 20 Jul 2007, 18:39:05 UTC - in response to Message 606592.
I've not found the monitoring software to be that useful, frankly, which is why I just jerk the power plug and observe. What I do is set the transfer switch to make one UPS "primary" and plug in a normal electric clock. I set the clock for noon, and pull the plug. When the UPS batteries "go dry" the clock stops, and the transfer switch puts the load on the other UPS. One of my UPSes will do six hours under load.
... but that's not the normal setup. The normal setup is for the UPS to signal "mains down" and the server(s) then do a graceful shutdown and power off the UPS. That's what the software is for. For best battery life, you want to get off the UPS before the batteries get hot, and in most "factory configuration" UPSes, they will get hot pretty quick.
-- Ned
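
The "normal setup" Ned describes (UPS signals "mains down", servers shut down gracefully) can be sketched as a small decision function. Everything here is hypothetical: the `UpsStatus` fields, the 120-second grace period, and the action names are illustrative assumptions, not the interface of any real UPS monitoring package.

```python
from dataclasses import dataclass


@dataclass
class UpsStatus:
    """Hypothetical snapshot of what monitoring software might report."""
    on_battery: bool          # True once the UPS signals "mains down"
    seconds_on_battery: int   # time elapsed since the mains failed


def next_action(status: UpsStatus, grace_seconds: int = 120) -> str:
    """Decide what a server should do for a given UPS status.

    grace_seconds is an ASSUMED threshold: ride out short blips, but get
    off the battery well before it runs down.
    """
    if not status.on_battery:
        return "run"
    if status.seconds_on_battery < grace_seconds:
        return "wait"
    return "shutdown"  # graceful shutdown, after which the UPS is powered off


print(next_action(UpsStatus(on_battery=True, seconds_on_battery=300)))
```

A short grace period fits Ned's advice about battery life: the less time spent on battery, the less the batteries heat up.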

Logan Volunteer tester Joined: 26 Jan 07 Posts: 743 Credit: 918,353 RAC: 0	Message 606624 - Posted: 20 Jul 2007, 19:59:27 UTC Last modified: 20 Jul 2007, 20:18:01 UTC
Why do you think these things are named 'ups...!'...? (with a look of panic on some administrator's face when that happens...) Ha, ha, ha... Sorry for the joke...

Arthur Clarke Joined: 3 Apr 00 Posts: 1 Credit: 63,209 RAC: 0	Message 606845 - Posted: 21 Jul 2007, 5:02:19 UTC - in response to Message 606152.
My preferences are set to maintain a stockpile of about 10 days of work. I'd been running version 5.8.16, and for the past week or so it had not been receiving new work units. Restarting it or reinstalling it didn't change the behavior. The results-to-send queue wasn't zero for all of that time. It was down to the last two workunits when I noticed that version 5.10.13 was now recommended. I downloaded and installed it. The next time it went to the well, it received a refill of about 340 hours of work (18 workunits). Did the mix of client versions requesting new work change significantly during the past week?

Logan Volunteer tester Joined: 26 Jan 07 Posts: 743 Credit: 918,353 RAC: 0	Message 606891 - Posted: 21 Jul 2007, 9:28:44 UTC - in response to Message 606845. Last modified: 21 Jul 2007, 9:33:28 UTC
Hi Clarke. The 5.8.16 BOINC manager version does not have cache capability. You can set it to 10 more days, but it's futile... Only 5.10+ can use that.
Logan.

Claggy Volunteer tester Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 606903 - Posted: 21 Jul 2007, 9:51:42 UTC - in response to Message 606891.
Not true. BOINC 5.8.16 will cache up to 10 days' work; it's just that on your General Preferences page you have to put the figure in the first box, the one that says: "Computer is connected to the Internet about every (Leave blank or 0 if always connected. BOINC will try to maintain at least this much work.)"
Claggy

Logan Volunteer tester Joined: 26 Jan 07 Posts: 743 Credit: 918,353 RAC: 0	Message 606912 - Posted: 21 Jul 2007, 10:09:29 UTC - in response to Message 606903.
And the preferences say 'Maintain enough work for an additional' ... n days '(Requires 5.10+ client.)'. But if you don't have 5.10.7 or 5.10.13 (for example), setting this parameter to 10 days is futile...
Regards, Claggy.
Logan.

1mp0£173 Volunteer tester Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 607120 - Posted: 21 Jul 2007, 16:26:04 UTC - in response to Message 606891.
This is incorrect. There are more cache settings in 5.10+ than in 5.8.16, but BOINC has been caching work well back into the 4.x versions.

OzzFan Volunteer tester Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28	Message 607204 - Posted: 21 Jul 2007, 19:15:54 UTC
Please keep in mind that with short deadlines and a large cache, BOINC will go into EDF ("panic") mode, thinking it will not be able to finish all the work it just downloaded. When this happens, BOINC will automatically cut off any more downloads until it gets the cache down to a reasonable level.
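
EDF stands for "earliest deadline first". A minimal sketch of the idea follows, assuming a single CPU and made-up task figures. It illustrates the scheduling rule and a simple "will anything finish late?" check only; it is not BOINC's actual scheduler code, and the `Task` fields are hypothetical.

```python
from typing import List, NamedTuple


class Task(NamedTuple):
    name: str
    hours_left: float      # estimated remaining compute time
    deadline_hours: float  # hours from now until the result is due


def edf_order(tasks: List[Task]) -> List[Task]:
    """Return tasks sorted so the earliest deadline runs first."""
    return sorted(tasks, key=lambda t: t.deadline_hours)


def tasks_missing_deadline(tasks: List[Task]) -> List[str]:
    """Simulate running tasks back to back in EDF order on one CPU.

    Returns the names of tasks that would finish after their deadline.
    """
    elapsed = 0.0
    late = []
    for task in edf_order(tasks):
        elapsed += task.hours_left
        if elapsed > task.deadline_hours:
            late.append(task.name)
    return late


# Illustrative queue: 110 hours of work, with the last deadline at 100 hours.
queue = [Task("a", 20, 100), Task("b", 30, 40), Task("c", 60, 90)]
print([t.name for t in edf_order(queue)])
print(tasks_missing_deadline(queue))
```

A non-empty result from a check like this is one plausible reason for a client to stop asking for more work until the queue shrinks, which matches the behaviour OzzFan describes.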

Dr. C.E.T.I. Joined: 29 Feb 00 Posts: 16019 Credit: 794,685 RAC: 0	Message 607221 - Posted: 21 Jul 2007, 19:34:26 UTC
i got 271 units automatically (weird) a while ago . . . didn't ask - just got 'em . . . guess that'll do for the outage (HP Laptop dv9060us Intel Dual Core 2). . .

Logan Volunteer tester Joined: 26 Jan 07 Posts: 743 Credit: 918,353 RAC: 0	Message 607436 - Posted: 22 Jul 2007, 20:32:47 UTC - in response to Message 607120.
Try using the 5.8.16 version for Windows and after that, tell me how well your cache works... ha, ha, ha...
Logan.
BOINC FAQ Service (Ahora, también disponible en Español/Now available in Spanish)

Greg Niehues Joined: 29 Oct 06 Posts: 3 Credit: 576,026 RAC: 0	Message 607469 - Posted: 22 Jul 2007, 21:26:32 UTC - in response to Message 607436.
I'm using it - and caching with it. Works fine for me. ha, ha, ha.....

1mp0£173 Volunteer tester Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 607502 - Posted: 22 Jul 2007, 23:05:59 UTC - in response to Message 607436.
Actually, as a tester I've run most versions, so I know -- and I've spent a lot of time experimenting with how BOINC handles the various parameters.
If you set "cache additional days" to 0, 5.10+ and 5.8.16 work exactly the same way, and the "connect every 'x' days" is the cache setting. If you tell BOINC "connect every 3 days" it will try to carry enough work so that it does not run out in less than 3 days, and doesn't miss deadlines at 6. Because the setting is indirect (you aren't saying "cache 3 days", you're setting the interval) it won't do exactly three days.
If you set "connect every 10 days" BOINC will have trouble if you have work units with short deadlines -- and that happens a lot. A 10 day interval will not cache 10 days of work because BOINC knows that if it waits 10 days that work will be late.
But, you're relatively new, and this has been discussed ad nauseam. Every version of BOINC, going back to the first public release, has had caching. Most versions have worked as designed -- and the arguments have always been over the design, not the implementation.
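
One possible reading of Ned's description can be written as a toy model. The assumption here (mine, not BOINC's real work-fetch logic) is that finished work may have to wait up to one full connect interval before it can be reported, so only work that finishes by the deadline minus the interval is safe to hold. The function name and the 14-day deadline in the second call are hypothetical.

```python
def safe_cache_days(connect_interval_days: float, deadline_days: float) -> float:
    """Toy estimate of how many days of work can be held without lateness.

    ASSUMPTION (not BOINC's actual algorithm): work must finish by
    (deadline - connect interval) so it can still be reported on time,
    and the client never holds more than one connect interval of work.
    """
    if connect_interval_days < 0 or deadline_days < 0:
        raise ValueError("values must be non-negative")
    slack = deadline_days - connect_interval_days
    return float(max(0.0, min(connect_interval_days, slack)))


# "Connect every 3 days" with work due in 6 days: the full 3 days fit.
print(safe_cache_days(3, 6))
# "Connect every 10 days" with a hypothetical 14-day deadline: far less than 10.
print(safe_cache_days(10, 14))
```

Under this model a long connect interval combined with short deadlines shrinks the usable cache, which is the effect described in the post, though the real client's rules are more involved.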

Uioped1 Volunteer tester Joined: 17 Sep 03 Posts: 50 Credit: 1,179,926 RAC: 0	Message 607907 - Posted: 23 Jul 2007, 23:06:11 UTC - in response to Message 606352.
There are intriguing possibilities in flywheel-based UPSes. I think only a couple of companies have brought products to the market for datacenters, but it's a very nice alternative to batteries.

Grant (SSSF) Volunteer tester Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 607991 - Posted: 24 Jul 2007, 5:08:17 UTC - in response to Message 607907. Last modified: 24 Jul 2007, 5:08:44 UTC
Converting electrical energy to mechanical energy & back to electrical energy generally isn't as efficient as electrical-chemical-electrical.
Grant
Darwin NT

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.