A Billion or So (Jan 06 2010)
Author | Message |
---|---|
Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Still catching up from the reduced/random schedule during the holidays. The science database rehabilitation project continues. We're nearing the end: the primary science database (thumper) is now corruption free, stable, and logging properly. The secondary science database (bambi) is being rebuilt as I type using the science database backup we made on Monday. The rebuilding is going rather slowly - we predict it will take 11 days (!) at current rates. As I typed this paragraph we noticed the rebuild was stuck. We feared we would have to reboot the system and start again from scratch, but luckily we were able to find the errant process locking the whole system, and everything else sprang to life, continuing where it left off. Phew.
By the way... not to rain on the parade, but during the holidays one of the drives in thumper's RAID issued some warnings. Last time that happened we got some, well, um... corruption. I doubt we'll have to go through this whole rigmarole again. If anything, just a small part of the cookbook. Ah, probably not worth worrying about. We'll run some checks when all the above is through and see where we're at.
In better news, I got scram_peek working again. What's that? It's a little utility that runs down at the telescope and reads various diagnostics as they are broadcast around the local net. Stuff like current telescope position, whether ALFA is running, etc. This hasn't been working since our data recorder issues a loooong time ago, so our science status page (where we post such info) has been rather stale. One major stumbling block was that the old scram_peek ran on a Solaris machine, but that particular system died. We had no other Solaris system handy so I had to recompile it on Linux. It's really old code, linking against even older libraries. I had some compiler errors to work through - annoying but nothing too extreme. Anyway, I'm looking at the science status page right now and the ALFA receiver light is green. That's beautiful. 
You may also notice the # of spikes in the science database is shockingly low. That's because we recently split the spike table into two (it grew beyond the bounds a single logical table could handle). We'll combine them again at a later point. Until then, that number is off by a billion or so (1,341,844,240 to be exact). - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
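As a rough illustration of the arithmetic in the post above: while the spike table is split in two, a page that counts only one part under-reports the total by the size of the part it does not see. The sketch below is only that, a sketch. The function name and the displayed count are hypothetical, and it assumes the quoted offset is simply the number of rows missing from the displayed figure. Only the 1,341,844,240 value comes from the post.

```python
# Offset quoted in the post: how far the displayed spike count is off.
UNCOUNTED_SPIKES = 1_341_844_240


def true_spike_count(displayed_count: int, uncounted: int = UNCOUNTED_SPIKES) -> int:
    """Return the combined spike count across both parts of the split table.

    Assumes (not confirmed by the post) that the displayed count covers one
    part of the table and `uncounted` covers the other.
    """
    if displayed_count < 0 or uncounted < 0:
        raise ValueError("row counts cannot be negative")
    return displayed_count + uncounted


# Hypothetical displayed value, for illustration only.
displayed = 25_000_000
print(f"{true_spike_count(displayed):,}")
```

Once the two tables are combined again, the correction term drops to zero and the displayed count can be used as-is.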
Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
Thanks for the news Matt, glad most things are doing OK. Did you see my questions in http://setiathome.berkeley.edu/forum_thread.php?id=56160? What happened to Mork a few days ago? Have you eliminated any intermittent hardware problems? |
Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Right... regarding mork we still have no clue. I don't think it's power - the system is just hanging there in a frozen state until we have to hard reset it. Maybe that's a symptom of a power problem I've never seen before. Anyway, it's an "engineering model" system so all bets are off, really. The good news on that front, if anything, is that Jeff and I finally incorporated an IP-enabled power switch so we can at least fully power cycle the thing from the comfort of our homes next time this happens off hours. It's still no good to have your master user database crash and recover every other week. - Matt |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Thanks for the update Matt, Claggy Edit: Strangely enough, I was an Avionics Technician in the 80's as well :) |
Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
This is just a theory, as I have not seen the system, but you might have some AC ripple leaking through onto a DC line, due to a filter breaking down. Is it possible to swap out a power unit? I used to be an avionics tech in the Air Force back in the '80s. I have seen a few weird snags get tracked down to bad power or bad grounding or earthing. Keith T. |
Luke Joined: 31 Dec 06 Posts: 2546 Credit: 817,560 RAC: 0 |
Great work Matt! Is there perhaps any chance of an update at the S@H Data Distribution History page? Looks like the last time it was updated was in August 2008... - Luke. |
Joined: 30 Aug 05 Posts: 452 Credit: 142,832,523 RAC: 94 |
Can't...resist... <sagan> a billion or so </sagan> :D mambo |
Peter Jeremy Joined: 18 May 99 Posts: 1 Credit: 1,821,011 RAC: 0 |
"one of the drives in thumper's RAID issued some warnings. Last time that happened we got some, well, um... corruption." This has been suggested before, but why not use ZFS on the Thumper (and other fileservers)? Having an additional end-to-end checksum is very useful when your data volume starts to approach typical disk bit-error rates. You can also run a data verify ("scrub") to ensure that the data you read off the drives matches the expected data. |
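To make the checksum-and-scrub idea in the post above concrete, here is a toy sketch. It is not how ZFS is implemented: real ZFS keeps each checksum in the parent block pointer and can repair bad blocks from redundant copies, whereas this example stores the checksum next to the data and only detects mismatches. The `StoredBlock`, `write_block`, and `scrub` names are hypothetical, chosen for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class StoredBlock:
    data: bytes
    checksum: str  # SHA-256 hex digest recorded when the block was written


def write_block(data: bytes) -> StoredBlock:
    """Store data together with a checksum computed at write time."""
    return StoredBlock(data=data, checksum=hashlib.sha256(data).hexdigest())


def scrub(blocks: list[StoredBlock]) -> list[int]:
    """Re-read every block and return indices whose data no longer matches."""
    return [
        i
        for i, block in enumerate(blocks)
        if hashlib.sha256(block.data).hexdigest() != block.checksum
    ]


blocks = [write_block(b"row 1"), write_block(b"row 2"), write_block(b"row 3")]

# Simulate silent corruption: one bit flips after the block was written.
damaged = bytearray(blocks[1].data)
damaged[0] ^= 0x01
blocks[1].data = bytes(damaged)

print(scrub(blocks))  # indices of blocks that fail verification
```

The point of the design is that corruption is caught by comparing against a checksum taken when the data was known to be good, rather than trusting whatever the drive returns.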
DJStarfox Joined: 23 May 01 Posts: 1066 Credit: 1,226,053 RAC: 2 |
"It's still no good to have your master user database crash and recover every other week." Maybe you'd be better off switching it to be the secondary instead? |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
Is there any chance whatsoever to run some of these servers as virtual servers, thereby increasing the system's flexibility? It seems possible that one of the VM providers may be interested in the experiment and would provide 'free' software to such a visible project. With any luck it may make Matt's work-experience a little less anxious! |
John McLeod VII Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
"Is there any chance whatsoever to run some of these servers as virtual servers, thereby increasing the system's flexibility? It seems possible that one of the VM providers may be interested in the experiment and would provide 'free' software to such a visible project. With any luck it may make Matt's work-experience a little less anxious!" Virtual servers are only useful if: 1) The servers are not completely saturated (virtualizing things does take extra CPU and disk resources). 2) There are processes that come and go on an irregular basis. Neither of these is true in this case. BOINC WIKI |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
I was thinking of the capability of migrating the VM between hardware platforms. Isn't that a useful property, as well? |
Joined: 28 Sep 02 Posts: 362 Credit: 16,590,653 RAC: 0 |
With all the servers running at full load, where could you migrate the VM to? "I was thinking of the capability of migrating the VM between hardware platforms. Isn't that a useful property, as well?" mic. |