And Up Again (Sep 11 2008)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
So we hit that brick wall again with the science database - that is, when we try to create a new index it works fine on the primary server but then clogs up sending the new index pages to the secondary. This clog locks up the database, the splitters grind to a halt, the assimilators grind to a halt, i.e. fun for everybody! We thought we were out of the woods yesterday afternoon, but checking in at 1am last night (this morning?) I saw this all happening again, so I gave things a swift kick and went to bed. This morning, once we were all here at the lab, we decided to just bite the bullet this time, shut down all the splitters/assimilators, and let the clog work through naturally on its own, which it did. We also used the downtime to do an "update statistics" on one signal table (this helps re-sort current indexes for speedier lookups) and add disk space for said indexes. I just turned things back on, we'll be catching up for a while, etc. I did do some QLogic card testing today, which got us over my "information gathering and training" hurdle, so we can upgrade the remaining two servers with old OSes in the coming weeks. We also got our homemade NAS configured so that we may get the old NetApp rack out of the closet, maybe next week. The NetApp is still working quite reliably, but it takes up a third of our closet space and a seventh of our power while delivering only 2 TB of raw disk space. Not really efficient, and we have a *lot* of servers waiting to get into the closet already. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
squishymaster Joined: 21 Oct 03 Posts: 34 Credit: 784,496 RAC: 0 |
Thanks for the update Matt. I wonder about this problem that keeps coming up. Is it because there isn't enough hard drive space? Or because the servers aren't fast enough? |
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
"Is it because there isn't enough hard drive space? Or because the servers aren't fast enough?" Still unclear to me. Bob's looking into it. Could be we have simple configuration screws to tighten, or that the server acting as the primary science database also has 48 drives, so we use up a lot of extra resources transferring raw data to/from there. Maybe it's the RAID configuration on the database/raw data drives (maximized for storage, not for speed). - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
squishymaster Joined: 21 Oct 03 Posts: 34 Credit: 784,496 RAC: 0 |
So it would seem that the problem is too many results coming in too quickly? Wouldn't it be possible to create a bottleneck that would only allow so many results to come in at a time, so the rest would have to wait until the way was clear? |
Dr. C.E.T.I. Joined: 29 Feb 00 Posts: 16019 Credit: 794,685 RAC: 0 |
. . . nice work from all @ Berkeley - Thanks for the Posting Matt BOINC Wiki . . . Science Status Page . . . |
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
"So it would seem that the problem is too many results coming in too quickly? Wouldn't it be possible to create a bottle neck that would only allow so many results coming in at a time and the rest would have to wait until the way was clear?" Nope. We can insert results probably 10 times as fast as we do now. In fact, the bottleneck there would be our bandwidth for workunits going out (you have to download a workunit to ultimately send a result). The current issues deal with stuff on the system outside of creating workunits and inserting results - like storing 8 TB of raw data there, creating indexes for future scientific analysis, etc. This *is* a problem, but not exactly due to too many results. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
Got the following when I tried to update and report 5 WUs: 9/12/2008 10:33:39 AM|SETI@home|Scheduler request failed: HTTP internal server error. This is with client 5.10.30... not the client that usually gives this error. Hello, from Albany, CA!... |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
The .xml stats export is apparently not working again... no updates on the stats sites for the last two days. Hello, from Albany, CA!... |
speedimic Joined: 28 Sep 02 Posts: 362 Credit: 16,590,653 RAC: 0 |
Looking at the server status page, it seems they are working on it... mic. |
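
For readers curious about the throttling idea squishymaster raises in this thread (let only so many results in at a time and make the rest wait), here is a minimal sketch using a bounded semaphore. It is purely illustrative: Matt's reply says result insertion is not the actual bottleneck, and every name here (`ResultGate`, `insert`, `sink`, `run_demo`) is hypothetical rather than taken from the BOINC or SETI@home code.

```python
import threading


class ResultGate:
    """Hypothetical admission gate: at most `limit` results are handled at once."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.active = 0  # results currently being inserted
        self.peak = 0    # highest concurrency observed so far

    def insert(self, result, sink):
        # Callers block here until one of the `limit` slots is free.
        with self._slots:
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                sink(result)  # stand-in for the real database insert
            finally:
                with self._lock:
                    self.active -= 1


def run_demo(n_results=20, limit=3):
    """Push n_results through the gate from separate threads."""
    gate = ResultGate(limit)
    stored = []
    store_lock = threading.Lock()

    def sink(result):
        with store_lock:
            stored.append(result)

    threads = [
        threading.Thread(target=gate.insert, args=(i, sink))
        for i in range(n_results)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(stored), gate.peak
```

Calling `run_demo(20, 3)` stores all 20 results while never running more than 3 inserts at once. A gate like this only helps when incoming results are the limiting factor, which, per Matt's answer above, they are not here.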