Bumpy Ride (Jun 03 2008)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Good news. The science database problems were far less severe than we thought. Short story: we ran out of space. Long story: because of a slightly confusing configuration, we thought we had run out of extents for reasons unclear. Informix categorizes all usable storage space into dbspaces, fragments, chunks, extents... maybe more things, I'm not sure. We've had problems in the past where we ran out of extents long before running out of actual disk space, and we thought this is what had happened again. The solution for that is painful - basically like rebuilding a RAID system (unload everything, recreate, and reload). Luckily we discovered we had some fragments/chunks misaligned (some fragments had more chunks than others), so all we had to do was add more chunks, and we had plenty of disk space for that. We added enough to get by for now, and will add more once we catch up from the queue draining/filling.
We had our usual outage today (for BOINC database backup/compression, etc.). Between the usual recovery for that and the recovery for all the above, it may be a bumpy ride for the next 24 hours or so.
Yesterday afternoon server "bane" (one of the two download servers) was having mounting issues which required a reboot to clean up. I was home at the time and rebooted it remotely. Of course, like my desktop last week, a new kernel had been yum'ed in recently and messed up grub for some reason, so it wouldn't load the OS. I had to get Jeff, who was still at the lab, to boot from the emergency DVD and then boot an older kernel. While bane was down, half the download connections were failing, but retries were usually successful since we have the two redundant servers.
Today I got server anakin more officially racked up (actually just sitting in a rack directly on top of a UPS) to ultimately become the new scheduler. It's a recently donated (used) dual Xeon that is actually less powerful than our current scheduler, ptolemy, but should be able to handle the job just fine. We plan on making ptolemy, with its 16 mostly unused drive bays, a network storage server to replace our ageing Network Appliance server, which fell out of service long ago and whose many drives are dying with regularity - infrequent but still worrisome.
- Matt
-- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
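To illustrate the "some fragments had more chunks than others" situation described in the post above, here is a minimal sketch that flags fragments with fewer chunks than the largest one. The fragment names, chunk sizes, and the `chunks_needed` helper are all hypothetical and made up for illustration; this is not how Informix reports its layout, and it is not the procedure the SETI@home staff actually used.

```python
# Hypothetical layout: fragment name -> sizes (in GB) of the chunks backing it.
# Names and numbers are illustrative only, not taken from the real database.
fragments = {
    "frag_a": [10, 10, 10],
    "frag_b": [10, 10, 10],
    "frag_c": [10],
}


def chunks_needed(layout):
    """Return how many chunks each fragment lacks compared to the largest one."""
    target = max(len(chunks) for chunks in layout.values())
    return {
        name: target - len(chunks)
        for name, chunks in layout.items()
        if len(chunks) < target
    }


if __name__ == "__main__":
    for name, missing in chunks_needed(fragments).items():
        print(f"{name}: add {missing} chunk(s)")
```

In a check like this, only the fragments that lag behind are reported, which matches the fix described in the post: add chunks where they are missing rather than unload and reload everything.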
Dr. C.E.T.I. Joined: 29 Feb 00 Posts: 16019 Credit: 794,685 RAC: 0 |
. . . well - lots of great News eh Matt, Thanks for Posting Sir! and Thanks to everyone else up there @ Berkeley - it's All appreciated . . . |
gomeyer Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0 |
Good news indeed. Thanks for the update. . . . Just one small favor: could you kick the Server Status stats? The time is updating, but the data and 'as of' do not seem to be. Of course, if this will slow recovery then please forget it for now. Later . . Don't know if you did something, but it looks OK now. Thanks again. |
Keith T. Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
Matt Lebofsky wrote: "Good news. The science database problems were far less severe than we thought. Short story: we ran out of space. Long story: due to a slightly confusing configuration we thought we ran out of extents for reasons unclear. Informix categorizes all usable storage space into dbspaces, fragments, chunks, extents... maybe more things I'm not sure. We've had problems in the past where we ran out of extents long before running out of actual disk space and we thought this is what happened again. The solution for such is painful - basically like rebuilding a RAID system (unload everything, recreate, and reload). Luckily we discovered we had some fragments/chunks misaligned (some fragments had more chunks than others) so all we had to do was add more chunks, and we had plenty of disk space for that. We added enough to get by for now, and will do more when we catch up from the queue draining/filling...."
As AstroPulse is likely to be arriving on SETI main in the not too distant future, how is that likely to affect space considerations? Specifically, AstroPulse WUs are around 8MB compared to 367kB for MultiBeam WUs, and run times are likely to be significantly longer, probably at least 25 times longer.
Sir Arthur C Clarke 1917-2008 |
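The workunit figures quoted in the post above can be turned into a rough comparison. The sketch below assumes "around 8MB" means 8000 kB (decimal units) and treats the "at least 25 times longer" run time as exactly 25; both are simplifications of the poster's estimates, not measured values. It only compares download size per workunit and per unit of compute time, and says nothing about space in the science database, since result sizes are not given in the thread.

```python
# Figures taken from the post; the unit conversion is an assumption.
MULTIBEAM_KB = 367            # MultiBeam workunit size in kB
ASTROPULSE_KB = 8 * 1000      # "around 8MB", assuming 1 MB = 1000 kB
RUNTIME_FACTOR = 25           # "at least 25 times longer", taken as exactly 25

# How many times larger one AstroPulse workunit is than one MultiBeam workunit.
size_ratio = ASTROPULSE_KB / MULTIBEAM_KB

# Download volume per unit of compute time, relative to MultiBeam.
# A value below 1 means less data is downloaded per hour of crunching.
rate_ratio = size_ratio / RUNTIME_FACTOR

if __name__ == "__main__":
    print(f"size ratio: {size_ratio:.1f}x")
    print(f"download volume per compute hour: {rate_ratio:.2f}x")
```

Under these assumptions each AstroPulse workunit is roughly 22 times larger, but because it also runs about 25 times longer, a host would download slightly less data per hour of computation than it does with MultiBeam.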
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
Matt Lebofsky wrote: "Good news. The science database problems were far less severe than we thought. Short story: we ran out of space. Long story: due to a slightly confusing configuration we thought we ran out of extents for reasons unclear. Informix categorizes all usable storage space into dbspaces, fragments, chunks, extents... maybe more things I'm not sure. We've had problems in the past where we ran out of extents long before running out of actual disk space and we thought this is what happened again. The solution for such is painful - basically like rebuilding a RAID system (unload everything, recreate, and reload). Luckily we discovered we had some fragments/chunks misaligned (some fragments had more chunks than others) so all we had to do was add more chunks, and we had plenty of disk space for that. We added enough to get by for now, and will do more when we catch up from the queue draining/filling...."
I know these questions don't apply here, so redirect me as needed. But ... 1) Will SETI continue with the 'standard' WUs in the foreseeable future, or will we all have to convert to the AP app? 2) Will the optimized application set be tossed out, so that our 'optimizers' have to start from scratch? 3) When will the chaos commence? |
Keith T. Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
See Message boards : AstroPulse. The thread Astropulse FAQ might be a good place to start.
1) As far as I know, SETI@home will continue to process MultiBeam data alongside the new AstroPulse apps.
2) Optimized apps will still be usable on MultiBeam WUs. I have also heard rumours that there are likely to be optimized AstroPulse apps eventually.
3) I hope that it won't be chaos.
If you want to know more, you are quite welcome to attach to SETI Beta, where the AP 4.33 apps were just released yesterday. N.B. No official Mac apps yet, but I know that some Mac testing is going on. Keith. [small edit] |
Pappa Joined: 9 Jan 00 Posts: 2562 Credit: 12,301,681 RAC: 0 |
Keith, actually you did a great job of letting the cat out of the bag (Grin). The only thing I would add is that as Astropulse is released into the wild here at Seti, Seti Beta goes back to testing Version 6.x of the MultiBeam Application. As for "rumor," Astropulse should reach Seti in the month of June. Shortly, I will set up a conversation thread...
Please consider a Donation to the Seti Project. |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
Warnings about Astropulse: As mentioned earlier up-thread, the WU d/l is 8 MB. If the new version is as efficient/inefficient as the prior incarnations (I haven't gotten a 4.33 WU yet, I'm still working a 4.32...) then an Opteron-165 (my Beta computer) will (supposedly) take in the neighborhood of 53 hours to process a WU. (I can't say for certain, as every WU I've gotten has ended early, some only about 2% in.) I'm sure that an optimized app will appear at some point, after the code is released. Whether that app will give as much speed-up as the SETI apps do is open to question. Beta tester . Hello, from Albany, CA!... |