Results (Mar 11 2015)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
How about a new thread? Last night I noticed the assimilators were failing, and this led to the usual conclusion: we ran out of extents on a table in the master database - this time the result table itself. Each result has a very basic entry in this table - it's basically a bridge between its signals and the parent workunit. Anyway, the solution is simple but painful - we gotta rebuild the whole table again. Luckily, we can do this in parallel with inserting new stuff. So I made a new result table this morning (hence the assimilator queues backing up in the meantime) and then over the next weeks (months) I can silently shovel the older results into this new table. There is a balancing act between size limits and performance. When building these tables we hope to build them big enough so we don't run into these logical barriers, but not so big that they perform poorly. Sometimes we aim too low, and then we hit these barriers (like running out of extents). In the case of the Astropulse database, I'm still making a massive backup copy of the whole database as it lives on disk (about 13TB) to archival storage before I do any of the next steps. The plan is still the same as when I last mentioned it - we will begin the reloads from the beginning using a new method that will only take about 2 months. In the meantime we will set up another functionally temporary db to insert new Astropulse data, which we will then merge again after these 2 months (or however long it takes). So we may see AP back online again in the near future. Anyway, it's still a mess, but there's slow/steady/cautious progress. I'm keeping "resend lost results" off for now - this functionality has been clobbering the BOINC/mysql database for a while. I think this is partially due to that database being a bit bloated with undigested workunits (i.e. stuff that's been stuck a while due to the Astropulse issues). - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Dad Joined: 21 May 99 Posts: 44 Credit: 35,266,844 RAC: 10 |
Thanks for the info |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Thanks for the update, Matt. Do you know what's happening at Seti Beta with the 'WUs with very unusual Autocorr parameters', and why the scheduler there is still down for maintenance? Edit: Of course, as soon as I post that, it's no longer down for maintenance. Claggy |
SciManStev Joined: 20 Jun 99 Posts: 6657 Credit: 121,090,076 RAC: 0 |
Matt, these updates from you and/or the staff go light years with us crunchers. Thank you! Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website |
Julie Joined: 28 Oct 09 Posts: 34060 Credit: 18,883,157 RAC: 18 |
|
_ Joined: 15 Nov 12 Posts: 299 Credit: 9,037,618 RAC: 0 |
Matt, these updates from you and/or the staff go light years with us crunchers. Yep, thanks for the update! I can't wait for some AP :) |
Les Binns Joined: 9 Oct 99 Posts: 62 Credit: 32,652,655 RAC: 1 |
Yes Thanks Matt |
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Minor update: The result table is being merged nicely at this point. Not sure how long it'll take, but we'll monitor its progress. The astropulse project is also moving forward - I'm back to rebuilding the signal table on the temporary server (the part that will take 6-8 weeks) and am cleaning up/backing up the main server - we might have Astropulse rolling again tomorrow, but more likely Monday. We'll cross that bridge of merging these two databases 8 weeks from now. In case you're worried this is eating up all my time, it's not. I'm actually mostly working on playing with SERENDIP VI data, making sure it works, helping Jeff track down networking problems between the data sources and the compute nodes down at Arecibo... - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
JaundicedEye Joined: 14 Mar 12 Posts: 5375 Credit: 30,870,693 RAC: 1 |
Great update Matt, can't wait to get some AP's again. Thanks for keeping us informed and for the dedication to the project. "Sour Grapes make a bitter Whine." <(0)> |
Cliff Harding Joined: 18 Aug 99 Posts: 1432 Credit: 110,967,840 RAC: 67 |
There is something that has bothered me for a very long time - 1) How large are the table extents? Are they at their maximum? Are the extents after the first n (e.g. 7) the same size as the first or progressively smaller, and is each succeeding group of extents handled in the same manner? 2) Is IDS (Informix Dynamic Server) data compression and storage optimization being utilized? I don't buy computers, I build them!! |
Wild6-NJ Joined: 4 Aug 99 Posts: 43 Credit: 100,336,791 RAC: 140 |
AP is up and starting to validate the backlog. Yay!!! |
Julie Joined: 28 Oct 09 Posts: 34060 Credit: 18,883,157 RAC: 18 |
|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Okay.. AP, such as it is, is back and creating/assimilating new work. Good. Of course while revving up all these engines lando decided to have a little NFS freakout. I just had to hard power cycle it but it's behaving now. I'll post a more formal news item on Monday, once we're sure things are stable... - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Thanks Matt !! |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Thank you Matt. Claggy |
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Oh now it looks like georgem is crashing too with the same NFS errors. Ha ha. Happy Friday afternoon... - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Gary Charpentier Joined: 25 Dec 00 Posts: 30927 Credit: 53,134,872 RAC: 32 |
Oh now it looks like georgem is crashing too with the same NFS errors. Ha ha. Happy Friday afternoon... Ouch, everything bad happens on Friday near quitting time. :( |
Julie Joined: 28 Oct 09 Posts: 34060 Credit: 18,883,157 RAC: 18 |
|
ivan Joined: 5 Mar 01 Posts: 783 Credit: 348,560,338 RAC: 223 |
Oh now it looks like georgem is crashing too with the same NFS errors. Ha ha. Happy Friday afternoon... In my extensive experience as an experimental physicist at various institutes around the world, everything bad happens on a Friday afternoon after the technicians have left, but before the scientists start knocking off for the weekend. |
rob smith Joined: 7 Mar 03 Posts: 22447 Credit: 416,307,556 RAC: 380 |
...Things always break just after the support staff reach the "Point of Unreachability". Thanks for your efforts Matt et al., it's good to see APs back in production :-) Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
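
Matt's first post describes creating a fresh result table, pointing new inserts at it, and gradually moving the older rows across. A minimal sketch of that kind of batched backfill might look like the following. It is not the project's actual code: it uses Python's built-in `sqlite3` as a stand-in for the real master database, and the table names `result_old` and `result_new`, the columns, and the batch size are all hypothetical, chosen only for illustration.

```python
import sqlite3


def backfill_in_batches(conn, batch_size=1000):
    """Move rows from result_old (hypothetical) into result_new (hypothetical)
    in small id-ordered batches. Returns the number of old rows processed."""
    processed = 0
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, workunit_id FROM result_old "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        # INSERT OR IGNORE skips ids already present, so a rerun after an
        # interruption does not fail on duplicates.
        conn.executemany(
            "INSERT OR IGNORE INTO result_new (id, workunit_id) VALUES (?, ?)",
            rows,
        )
        # Commit per batch so each transaction stays short.
        conn.commit()
        processed += len(rows)
        last_id = rows[-1][0]
    return processed


def demo():
    """Build toy tables, add one 'new' row, then backfill the old ones."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE result_old (id INTEGER PRIMARY KEY, workunit_id INTEGER)"
    )
    conn.execute(
        "CREATE TABLE result_new (id INTEGER PRIMARY KEY, workunit_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO result_old VALUES (?, ?)",
        [(i, i * 10) for i in range(1, 2501)],
    )
    # A new result goes straight into the new table while the backfill runs.
    conn.execute("INSERT INTO result_new VALUES (?, ?)", (2501, 25010))
    processed = backfill_in_batches(conn, batch_size=1000)
    total = conn.execute("SELECT COUNT(*) FROM result_new").fetchone()[0]
    conn.close()
    return processed, total
```

Calling `demo()` moves 2,500 old rows in three batches alongside the one row inserted directly. Small batches are one way to keep a long-running copy from blocking live inserts; the real system may handle this quite differently.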
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.