Hemiola (Aug 12 2010)

Author	Message
W-K 666 Volunteer tester Joined: 18 May 99 Posts: 19062 Credit: 40,757,560 RAC: 67	Message 1024918 - Posted: 14 Aug 2010, 7:12:53 UTC - in response to Message 1024574. snipped....... As a (mostly) mechanical engineer, it does my heart good to see my electronic colleagues adopting the time honoured and tested ways of the mech eng. We electronic engineers have always used the drop or kick test as a first-line tool for faulty equipment. Whether you count as a good, not so good or lousy engineer depends on your skill and experience at delivering the required accurate shock.

Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489	Message 1024930 - Posted: 14 Aug 2010, 8:07:34 UTC - in response to Message 1024574. Matt.. if I might share.. from my experiences of keeping together antiques that were often poorly "refurbished".. many problems clear permanently upon "re-seating": unplugging, and plugging back in. Other times, taking things out, some surprise drops loose (seen or unseen, 50/50).. and they are then magically "fixed". Whether they were dirty connections, a bit of dust, someone's Raisinette.. does not really matter as long as they clear. A bad connection invisible to the eye might merely need "bumped".. and could be gone forever. We came up with things such as the "pencil test".. where, while monitoring the signal, we tapped the outside of the case to see if it had any effect. And some of the equipment was old enough to even contain mercury relays, where the mercury would vaporize, re-solidify in obscure pieces, and refuse to work until we "bounced" it (hold edge of component 3-4" above anti-static surface, drop and catch on first bounce, re-insert) to clear. These are also good reasons why "fault tolerance" is a good (although expensive) principle. On the reports going back.. all of these were jotted down as "re-seat to clear." Because if we told the truth, the whole truth, and nothing but the truth... it would have been the Salem Witch trials all over again. One thing that used to work on CRT terminals, back in the '80s, was to give them a "slap upside the screen". Some terminals would come back to life for a time after the slap. Location (and force) was brand-dependent, and with one of the brands, there were two methods that worked, depending on symptom: the slap, directed at the upper right of the CRT, and lifting the front of the CRT about an inch, and dropping. IBM 3278s were pretty reliable, but when they went, they could (sometimes...) be brought back by slapping the back right corner, and picking up the back about .5 inch, and dropping... As a (mostly) mechanical engineer, it does my heart good to see my electronic colleagues adopting the time honoured and tested ways of the mech eng. And to think that some people think that I'm daft. LOL Maybe in the future we can just swear at the particular part/item without getting physical, but I doubt things will ever get that good. :D

soft^spirit Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0	Message 1024969 - Posted: 14 Aug 2010, 12:12:06 UTC The deciding factor for a technician to be considered a genius or an idiot.. Close the data room door. If they see how you fixed things.. you are an idiot. If they do not, you are magical! Janice

ToxicTBag Joined: 5 Feb 10 Posts: 101 Credit: 57,197,902 RAC: 0	Message 1025000 - Posted: 14 Aug 2010, 13:47:48 UTC Agrees with soft^spirit lol.

KWSN THE Holy Hand Grenade! Volunteer tester Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0	Message 1025037 - Posted: 14 Aug 2010, 15:52:33 UTC - in response to Message 1024930. Last modified: 14 Aug 2010, 15:53:46 UTC Matt.. if I might share.. from my experiences of keeping together antiques that were often poorly "refurbished".. many problems clear permanently upon "re-seating": unplugging, and plugging back in. Other times, taking things out, some surprise drops loose (seen or unseen, 50/50).. and they are then magically "fixed". Whether they were dirty connections, a bit of dust, someone's Raisinette.. does not really matter as long as they clear. A bad connection invisible to the eye might merely need "bumped".. and could be gone forever. We came up with things such as the "pencil test".. where, while monitoring the signal, we tapped the outside of the case to see if it had any effect. And some of the equipment was old enough to even contain mercury relays, where the mercury would vaporize, re-solidify in obscure pieces, and refuse to work until we "bounced" it (hold edge of component 3-4" above anti-static surface, drop and catch on first bounce, re-insert) to clear. These are also good reasons why "fault tolerance" is a good (although expensive) principle. On the reports going back.. all of these were jotted down as "re-seat to clear." Because if we told the truth, the whole truth, and nothing but the truth... it would have been the Salem Witch trials all over again. One thing that used to work on CRT terminals, back in the '80s, was to give them a "slap upside the screen". Some terminals would come back to life for a time after the slap. Location (and force) was brand-dependent, and with one of the brands, there were two methods that worked, depending on symptom: the slap, directed at the upper right of the CRT, and lifting the front of the CRT about an inch, and dropping. IBM 3278s were pretty reliable, but when they went, they could (sometimes...) be brought back by slapping the back right corner, and picking up the back about .5 inch, and dropping... As a (mostly) mechanical engineer, it does my heart good to see my electronic colleagues adopting the time honoured and tested ways of the mech eng. And to think that some people think that I'm daft. LOL Maybe in the future we can just swear at the particular part/item without getting physical, but I doubt things will ever get that good. :D It's not that we were angry at the terminal (this was back in the days of mainframes...); it's that the technique worked! The slap or drop reset contacts inside the terminal (not gold-plated, for some reason...), which we didn't have the knowledge to disassemble (we'd have to call an outside contractor for that...). A colleague and I were the primary terminal fixers in the IT department; it was a secondary job for me, as I was primarily a computer operator. Hello, from Albany, CA!...

Sutaru Tsureku Volunteer tester Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1025068 - Posted: 14 Aug 2010, 18:12:27 UTC 'server run, August 13-16 2010' Because the thread mentioned above is closed, I'm writing here. For about 24 hours the Cricket graph has shown maxed-out traffic. So the 'WU limit in progress' that was mentioned isn't active. My BOINC could download the adjusted WU cache. So far there has been no server crash. So why would the limit be needed? Can the newly donated server manage the traffic better now?

perryjay Volunteer tester Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0	Message 1025105 - Posted: 14 Aug 2010, 20:23:44 UTC - in response to Message 1025099. When I was repairing military radios we called that the two-foot drop test. (They were a lot sturdier than civilian gear, so we had to drop them farther.) That, or they accidentally fell off the test bench! :-) PROUD MEMBER OF Team Starfire World BOINC

Gary Charpentier Volunteer tester Joined: 25 Dec 00 Posts: 30651 Credit: 53,134,872 RAC: 32	Message 1025137 - Posted: 14 Aug 2010, 21:53:18 UTC - in response to Message 1025068. Last modified: 14 Aug 2010, 21:53:32 UTC Can the newly donated server manage the traffic better now? Are you talking about the server that they still need to write the purchase order for?

Terror Australis Volunteer tester Joined: 14 Feb 04 Posts: 1817 Credit: 262,693,308 RAC: 44	Message 1025243 - Posted: 15 Aug 2010, 7:56:11 UTC - in response to Message 1025099. LOL! There ain't much that can't be fixed with the judicious application of a "Birmingham Screwdriver" :-) There are a number of faults in my company's system that list "percussive maintenance" as the fix :-) The Terror P.S. "Planters", "Bedlam" or "Just Right" (a breakfast cereal in Oz that contains the aforementioned fruits, nuts and flakes) all sound good to me :-)

Sutaru Tsureku Volunteer tester Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1025258 - Posted: 15 Aug 2010, 9:20:39 UTC - in response to Message 1025137. Last modified: 15 Aug 2010, 9:23:27 UTC Can the newly donated server manage the traffic better now? Are you talking about the server that they still need to write the purchase order for? Ohh.. I thought the newly donated server was already in the SETI@home lab. So then, I'm even more curious why we haven't seen a server crash. ;-)

Sutaru Tsureku Volunteer tester Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1025259 - Posted: 15 Aug 2010, 9:28:00 UTC - in response to Message 1025258. Last modified: 15 Aug 2010, 9:29:50 UTC About 28 1/2 hours of maxed-out traffic. The Cricket graph is no longer maxed out, so all PCs out there have downloaded their adjusted WU cache. So I'm curious why we had/needed the 'WU limit in progress'. IIRC, it was to protect against server crashes. But now it worked without a limit. Are the servers more stable now because of the last 3-day outage? If so, what did the SETI@home crew do that made the servers more stable?

Donald L. Johnson Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20	Message 1025310 - Posted: 15 Aug 2010, 15:15:11 UTC - in response to Message 1024344. @Sutaru So I'm curious why we had/needed the 'WU limit in progress'. IIRC, it was to protect against server crashes. But now it worked without a limit. Are the servers more stable now because of the last 3-day outage? If so, what did the SETI@home crew do that made the servers more stable? In the first post of this thread, Matt said: That's all good, but the week in general has been tainted by mork issues in general. It had one of its regular mystery crashes on Tuesday (followed by a long recovery). Then last night, and again this morning, the RAID mirror of two solid state drives (where we keep the innodb logs) started going flaky on us. The partition would just disappear, sending mysql into fits. We were able to quickly recover, but we're abandoning the solid state drives for now. Honestly, they weren't adding all that much to the i/o picture because we were cautious about how we were implementing them. Now I'm glad we were cautious. The upshot of all the above meant that we had to recover the replica as many as four times so far from the weekly backup. What a pain. The latest replica recovery is happening as I type this. All I hope is that all systems are normal and stable by tomorrow. Maybe that's part of it. If Mork is more stable without the flaky solid-state drives, the whole system is more stable. I'm also seeing fewer problems with goofy estimated completion times. That affects work fetch and cache filling. Maybe the server-side changes made a couple of weeks ago are finally settling in. If so, maybe Jeff will feel comfortable raising the Friday-Saturday download limits next week. Donald Infernal Optimist / Submariner, retired

Dave Barstow Joined: 14 May 99 Posts: 76 Credit: 15,064,044 RAC: 0	Message 1025339 - Posted: 15 Aug 2010, 17:02:18 UTC - in response to Message 1025310. Goofy WU Limits fixed??? I got two AP units a few hours ago and they 'think' they will require ~348 hrs when they have been completing in ~60 hrs for some time now. Hmmm??? 603 & 608 units seem to be timed about right. Guess it's just wait-and-see...

Josef W. Segur Volunteer developer Volunteer tester Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0	Message 1025361 - Posted: 15 Aug 2010, 19:03:57 UTC - in response to Message 1025339. Goofy WU Limits fixed??? I got two AP units a few hours ago and they 'think' they will require ~348 hrs when they have been completing in ~60 hrs for some time now. Hmmm??? 603 & 608 units seem to be timed about right. Guess it's just wait-and-see... Limits and estimates are totally unrelated. The limits which Jeff intended to set would have controlled how many "in progress" tasks of each type a host could have. The difficulty with AP estimates stems from a delay in getting AP validators which interface to the new server-side estimate adjustments. That was fixed during the first week of August, but only AP tasks sent for validation since then can be used in the server average used for that adjustment. Then it takes ten such tasks before the servers will consider the average close enough to use. So you can expect at least the next 8 AP tasks to also be grossly overestimated. If that causes a problem in work fetch, a question in the Number Crunching forum would be the appropriate place for further discussion. Joe
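
The warm-up Joe describes can be sketched roughly as below. This is only an illustration, assuming a plain running mean and a fixed threshold of ten validated tasks; the names `EstimateScaler` and `MIN_SAMPLES` are hypothetical and this is not the actual server code. The 348-hour and 60-hour figures are the ones Dave reported.

```python
# Hypothetical sketch: the average of validated runtimes is ignored until
# enough samples have accumulated, so early tasks keep the old estimate.
MIN_SAMPLES = 10  # per the post: ten validated tasks before the average is used


class EstimateScaler:
    """Illustrative stand-in for a server-side estimate adjustment."""

    def __init__(self, default_estimate_hours):
        self.default_estimate_hours = default_estimate_hours
        self.validated_runtimes = []

    def record_validated(self, runtime_hours):
        # Only tasks that passed validation contribute to the average.
        self.validated_runtimes.append(runtime_hours)

    def estimate(self):
        # Fall back to the uncorrected estimate until the sample is big enough.
        if len(self.validated_runtimes) < MIN_SAMPLES:
            return self.default_estimate_hours
        return sum(self.validated_runtimes) / len(self.validated_runtimes)


scaler = EstimateScaler(default_estimate_hours=348.0)
for _ in range(9):
    scaler.record_validated(60.0)
before = scaler.estimate()  # nine samples: still the uncorrected estimate
scaler.record_validated(60.0)
after = scaler.estimate()   # tenth sample: the running mean takes over
```

Under these assumptions, a host that has had two tasks validated would see eight more overestimated tasks before the correction applies, which matches the figure in the post.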

Fred J. Verster Volunteer tester Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0	Message 1025378 - Posted: 15 Aug 2010, 20:35:46 UTC - in response to Message 1025361. And also in the SETI Bêta N.C. forums: the same 'problem' occurs with ATI GPUs running OpenCL, CAL/BROOK or a mix of them. But the latest revisions, like 434, are quite fast and drop computing time from 122 to 3 - 6 hours. But this is a bit 'off topic'.

Claggy Volunteer tester Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1025664 - Posted: 16 Aug 2010, 19:59:02 UTC Last modified: 16 Aug 2010, 20:48:54 UTC Any chance we can have Seti Beta brought up, so we can upload/report our completed tasks before the next outage? Claggy Edit: and in future, bring it up a bit earlier, please?

Sutaru Tsureku Volunteer tester Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1025856 - Posted: 17 Aug 2010, 13:52:31 UTC Last modified: 17 Aug 2010, 13:58:58 UTC Validate errors! 'Message 1025850' Please disable the upload server/service, to keep the number of validate errors small. Will the SETI@home crew then run the famous script again to grant the credit?
