Eggs (Mar 24 2008)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Things have been running rather well over the past couple of weeks. Having effectively unlimited bandwidth really helps. It's a little more hectic behind the scenes as new data keeps getting sent up from Arecibo - we are continually working to offload the data to our local servers (and remote mass storage) so we can send back the blank drives for more. Steps will be taken soon to improve this situation (namely: sending some data to our remote storage via our faster Hurricane connection). There was a bit of a panic this morning, however. Suddenly gowron, our workunit storage server, reset itself. Not only did it reboot, but it lost all host/IP information. For all we could tell at first it lost everything! We had to connect to it over serial (most difficult part: finding the right cables) but once we got in we found our 2 terabytes of workunits were still intact (whew). So it was mostly a matter of reconfiguring the basic things and we were back in business. Why did it reset itself? That remains a mystery. Another minor gripe: I spent a man-day last week testing mdadm's "spare group" feature. That is, if a drive fails on a RAID device without a spare, it can steal a spare from another RAID device in the same spare group - mdadm's way of enabling a "hot spare pool." We never had a case where this would happen, nor did we ever test it. Now that thumper is down two spares (due to making a new small, separate RAID1 for database indexes) I wanted to test this. I made simple test cases and failed drives - but the available spares in the spare group weren't being utilized. Long story short - I actually recompiled my own mdadm with fprintf's all over the place and found mdadm behaving strangely. Thing is, this is mdadm version 2.6.2 we're talking about here, and mdadm is already up to version 2.6.4. So I downloaded that, and it worked, so apparently this bad behavior has been fixed. But Fedora doesn't have the latest version available yet, at least via "yum update," so we're pretty much waiting for the packaged version to become available rather than deploying a less trusted build, even if it seems to work better. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Dr. C.E.T.I. Joined: 29 Feb 00 Posts: 16019 Credit: 794,685 RAC: 0 |
. . . Thanks for the Posting Matt - figure 'Murphy' played a bit role eh ;) > Each of You Keep up the good work @ Berkeley . . . BOINC Wiki . . . Science Status Page . . . |
aplayer Joined: 26 Apr 00 Posts: 13 Credit: 15,217,341 RAC: 0 |
Another job well done. Ty for the news. |
Gary Charpentier Joined: 25 Dec 00 Posts: 30758 Credit: 53,134,872 RAC: 32 |
You have to watch out for those cosmic rays. With memory getting as dense as it is today a single particle can switch a bit or two. Better get a stronger magnetic field around the building to deflect them. :) TY for the post and info. |
AndyW Joined: 23 Oct 02 Posts: 5862 Credit: 10,957,677 RAC: 18 |
|
Andy Lee Robinson Joined: 8 Dec 05 Posts: 630 Credit: 59,973,836 RAC: 0 |
Gary Charpentier wrote: "You have to watch out for those cosmic rays. With memory getting as dense as it is today a single particle can switch a bit or two. Better get a stronger magnetic field around the building to deflect them. :)" Some particles can do as much damage as a baseball bat in a biscuit factory, as far as chips are concerned! Nothing will stop them. I don't think a big field would help that much... it could make more dangerous particles out of those that would have missed without it, and the kind with such high energies that can damage or influence a chip probably wouldn't even notice the field. If it worked, you'd probably need a field so strong that the electrons in the chips would lose their balance and fall off the wires... Lead lining and ECC offer more hope! |
Gary Charpentier Joined: 25 Dec 00 Posts: 30758 Credit: 53,134,872 RAC: 32 |
Hey, I know I'm talking about magnets the size of the LHC. After all, the particles are doing well over 99% of c, so bending their flight path will take a bit of power. They already got through the Earth's magnetic field, a few dozen miles of air, and a building, so I don't think a little lead lining will make any difference to them. Got to have that 100% uptime, you know. ;) |
Andy Lee Robinson Joined: 8 Dec 05 Posts: 630 Credit: 59,973,836 RAC: 0 |
Some of the most energetic particles literally have the energy of a tennis ball at first serve, but fortunately they are very rare. It's the slower secondary particles after a collision with a dense object that can do the damage. The magnetic field in a synchrotron is very strong, and very confined, and the particles in the ring possess the same polarity. They damage the collider if they go off course, so the timing, strength and precision are crucial (hence LHC@Home!). I don't think one could make such a strong magnetic field to encompass a whole building without a nearby dedicated power station, and any hard disks in a 100m radius probably wouldn't approve, nor the staff or their credit cards! What repels one polarity, attracts the other... I'd move it a km underground where natural radiation in the rocks can be easily blocked, but this can push up the commuting time a little! For 100% uptime, I'd do a multiple-master MySQL configuration with lots and lots of ECC RAM, dual PSUs and a UPS each, with independent power supplies from two separate providers for each half of the system, but that starts to get expensive... Then you can bring a database down for backup or repair while the other server(s) keep running. |
ML1 Joined: 25 Nov 01 Posts: 20514 Credit: 7,508,002 RAC: 20 |
See: "Should every computer chip have a cosmic ray detector?" Note that BOINC is protected from the effects of this to some extent by the quorum checking. Happy crunchin', Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
Andy Lee Robinson Joined: 8 Dec 05 Posts: 630 Credit: 59,973,836 RAC: 0 |
|
Dr. C.E.T.I. Joined: 29 Feb 00 Posts: 16019 Credit: 794,685 RAC: 0 |
. . . so, it's the Cosmic Rays freezin' my system every-now & then - wondered what was goin' on < maybe more research is needed here: Hard drive wobbles track earthquake spread > isn't somebody @ Berkeley still doing Research regarding ? ps - Thanks Martin . . . BOINC Wiki . . . Science Status Page . . . |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
Check again; lead is a good source of (alpha) radiation, which also flips your bits. Manufacturers try to use isotope-enriched lead to reduce the risk, but that is expensive and not foolproof. ECC and clever design are the answer, short of going to a wider bandgap material (non-Si). |
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
"Having effectively unlimited bandwidth really helps." That's great to hear. Did I miss something in one of Matt's posts? If not, how did SETI manage to get almost free bandwidth? Cheers Speedy |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
He didn't say "free" - "effectively unlimited" just means a fatter pipe that doesn't get max'd out. F. |
[KWSN]John Galt 007 Joined: 9 Nov 99 Posts: 2444 Credit: 25,086,197 RAC: 0 |
Technically, not a fatter pipe, but a bigger faucet (router). But, who's technical anymore? ;-)> Clk2HlpSetiCty:::PayIt4ward |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
Oops! We...ll, er, that word is in my CV... [Slapped wrist - hangs head.] Of course, you are right. Thanks. F. |
Mr. Majestic Joined: 26 Nov 07 Posts: 4752 Credit: 258,845 RAC: 0 |
|
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
I now understand, thank you for the reminder. Hope you all have a happy crunching weekend. Cheers Speedy |
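
For readers unfamiliar with the "spare group" idea Matt describes in the opening post, here is a toy Python model of the intended behaviour: an array that loses a drive and has no spare of its own borrows one from another array carrying the same spare-group label. This is only a sketch of the concept, not mdadm's code. The class `RaidArray`, the function `monitor_pass`, and the device and group names are all made up for illustration. In mdadm itself, as far as I know, the grouping is declared with a `spare-group=` setting on the `ARRAY` lines of mdadm.conf and acted on by mdadm running in monitor mode.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RaidArray:
    """Hypothetical stand-in for one RAID device (not an mdadm structure)."""
    name: str
    spare_group: Optional[str]  # arrays sharing this label pool their spares
    active: list = field(default_factory=list)
    spares: list = field(default_factory=list)
    degraded: bool = False

    def fail(self, drive: str) -> None:
        """Drop a failed drive and rebuild from a local spare if one exists."""
        self.active.remove(drive)
        self.degraded = True
        self.rebuild_from_local_spare()

    def rebuild_from_local_spare(self) -> None:
        if self.degraded and self.spares:
            self.active.append(self.spares.pop(0))
            self.degraded = False


def monitor_pass(arrays):
    """One sweep of an imaginary monitor: lend spares within a spare group."""
    moves = []
    for needy in arrays:
        if not needy.degraded or needy.spare_group is None:
            continue
        for donor in arrays:
            same_group = donor.spare_group == needy.spare_group
            if donor is not needy and same_group and not donor.degraded and donor.spares:
                drive = donor.spares.pop(0)
                needy.spares.append(drive)
                needy.rebuild_from_local_spare()
                moves.append((drive, donor.name, needy.name))
                break
    return moves


md0 = RaidArray("md0", "pool", active=["sda", "sdb"], spares=["sdc"])
md1 = RaidArray("md1", "pool", active=["sdd", "sde"])                # no spare of its own
md2 = RaidArray("md2", None, active=["sdf", "sdg"], spares=["sdh"])  # outside the pool

md1.fail("sde")                        # md1 is now degraded with nothing to rebuild from
moves = monitor_pass([md0, md1, md2])  # md0 lends its spare; md2 is left alone
print(moves)
```

The failure Matt hit with 2.6.2 corresponds to the lending step never happening: the degraded array stays degraded even though a spare sits idle elsewhere in the same group.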
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.