Results (Mar 11 2015)

Message boards : Technical News : Results (Mar 11 2015)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1651821 - Posted: 11 Mar 2015, 21:09:20 UTC

How about a new thread?

Last night I noticed the assimilators were failing, and this led to the usual conclusion: we ran out of extents on a table in the master database - this time the result table itself. Each result has a very basic entry in this table - it's basically a bridge between its signals and the parent workunit. Anyway, the solution is simple but painful - we gotta rebuild the whole table again.

Luckily, we can do this in parallel with inserting new stuff. So I made a new result table this morning (hence the assimilator queues backing up in the meantime) and then over the next weeks (months) I can silently shovel the older results into this new table. There is a balancing act between size limits and performance. When building these tables we hope to build them big enough so we don't run into these logical barriers, but not so big they don't perform. Sometimes we aim too low, and then we hit these barriers (like running out of extents).
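As a rough illustration of the "new table now, shovel the old rows in later" approach described above, here is a minimal sketch. It uses SQLite only because it is self-contained; the table names (`result_old`, `result_new`), the columns, and the batch size are hypothetical and are not the project's actual schema or tooling.

```python
import sqlite3


def migrate_in_batches(conn, batch_size=1000):
    """Copy rows from result_old into result_new in id order, one commit per batch.

    Small batches keep each transaction short so that ordinary inserts
    into result_new are not blocked for long.
    """
    copied = 0
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, workunit_id FROM result_old "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        with conn:  # commits the batch on success, rolls back on error
            conn.executemany(
                "INSERT OR IGNORE INTO result_new (id, workunit_id) VALUES (?, ?)",
                rows,
            )
        last_id = rows[-1][0]  # resume after the highest id copied so far
        copied += len(rows)
    return copied


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE result_old (id INTEGER PRIMARY KEY, workunit_id INTEGER)")
    conn.execute("CREATE TABLE result_new (id INTEGER PRIMARY KEY, workunit_id INTEGER)")
    # Hypothetical historical rows living in the old table.
    conn.executemany(
        "INSERT INTO result_old VALUES (?, ?)",
        [(i, i // 2) for i in range(1, 2501)],
    )
    # A fresh result arrives in the new table before the backfill has run.
    conn.execute("INSERT INTO result_new VALUES (?, ?)", (2501, 1250))
    conn.commit()
    copied = migrate_in_batches(conn, batch_size=1000)
    total = conn.execute("SELECT COUNT(*) FROM result_new").fetchone()[0]
    return copied, total
```

Calling `demo()` backfills the old rows in three batches while the newly inserted row stays in place. On a production database the same idea would presumably run with pauses between batches, so the backfill does not compete with live inserts.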

In the case of the Astropulse database, I'm still making a massive backup copy of the whole database as it lives on disk (about 13TB) to archival storage before I do any of the next steps. The plan is still such as it is last I mentioned it - we will begin the reloads from the beginning using a new method that will only take about 2 months. In the meantime we will set up another functionally temporary db to insert new Astropulse data, which we will then merge again after these 2 months (or however long it takes). So we may see AP back on line again in the near future. Anyway, it's still a mess, but there's slow/steady/cautious progress.

I'm keeping "resend lost results" off for now - this functionality has been clobbering the BOINC/mysql database for a while. I think this is partially due to that database being a bit bloated with undigested workunits (i.e. stuff that's been stuck a while due to the Astropulse issues).

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1651821
Dad
Volunteer tester
Joined: 21 May 99
Posts: 44
Credit: 35,266,844
RAC: 10
United States
Message 1651874 - Posted: 11 Mar 2015, 22:56:13 UTC

Thanks for the info
ID: 1651874
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1651876 - Posted: 11 Mar 2015, 23:00:55 UTC - in response to Message 1651821.  
Last modified: 11 Mar 2015, 23:02:49 UTC

Thanks for the update Matt,

Do you know what's happening at Seti Beta with the 'WUs with very unusual Autocorr parameters'?

and why the scheduler there is still down for maintenance?

Edit: Of course, as soon as I post that, it's no longer down for maintenance.

Claggy
ID: 1651876
SciManStev
Volunteer tester
Joined: 20 Jun 99
Posts: 6651
Credit: 121,090,076
RAC: 0
United States
Message 1651880 - Posted: 11 Mar 2015, 23:05:51 UTC

Matt, these updates from you and/or the staff go light years with us crunchers.
Thank you!

Steve
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website
ID: 1651880
Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 34041
Credit: 18,883,157
RAC: 18
Belgium
Message 1651914 - Posted: 12 Mar 2015, 0:28:27 UTC

Thanx for the update Matt.
rOZZ
Music
Pictures
ID: 1651914
_
Joined: 15 Nov 12
Posts: 299
Credit: 9,037,618
RAC: 0
United States
Message 1652190 - Posted: 12 Mar 2015, 19:57:50 UTC - in response to Message 1651880.  

Matt, these updates from you and/or the staff go light years with us crunchers.
Thank you!

Steve


Yep, thanks for the update! I can't wait for some AP :)
ID: 1652190
Les Binns
Volunteer tester
Joined: 9 Oct 99
Posts: 62
Credit: 32,652,655
RAC: 1
United Kingdom
Message 1652237 - Posted: 12 Mar 2015, 22:45:14 UTC - in response to Message 1651821.  

Yes Thanks Matt
ID: 1652237
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1652252 - Posted: 12 Mar 2015, 23:26:56 UTC

Minor update:

The result table is being merged nicely at this point. Not sure how long it'll take, but we'll monitor its progress.

The astropulse project is also moving forward - I'm back to rebuilding the signal table on the temporary server (the part that will take 6-8 weeks) and am cleaning up/backing up the main server - we might have Astropulse rolling again tomorrow, but more likely Monday. We'll cross that bridge of merging these two databases 8 weeks from now.

In case you're worried this is eating up all my time, it's not. I'm actually mostly working on playing with SERENDIP VI data, making sure it works, helping Jeff track down networking problems between the data sources and the compute nodes down at Arecibo...

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1652252
JaundicedEye
Joined: 14 Mar 12
Posts: 5375
Credit: 30,870,693
RAC: 1
United States
Message 1652259 - Posted: 12 Mar 2015, 23:46:17 UTC

Great update Matt, can't wait to get some AP's again.

Thanks for keeping us informed and for the dedication to the project.

"Sour Grapes make a bitter Whine." <(0)>
ID: 1652259
Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1652455 - Posted: 13 Mar 2015, 11:10:14 UTC

There is something that has bothered me for a very long time -

1) How large are the table extents? Are they at their maximum, and are the extents after the first n (e.g. 7) the same size or progressively smaller than the first, and is each succeeding group of extents handled in the same manner?

2) Is IDS (Informix Dynamic Server) data compression and storage optimization being utilized?


I don't buy computers, I build them!!
ID: 1652455
Wild6-NJ
Volunteer tester
Joined: 4 Aug 99
Posts: 43
Credit: 100,336,791
RAC: 140
Message 1652633 - Posted: 13 Mar 2015, 21:58:23 UTC

AP is up and starting to validate the backlog. Yay!!!
ID: 1652633
Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 34041
Credit: 18,883,157
RAC: 18
Belgium
Message 1652636 - Posted: 13 Mar 2015, 22:07:41 UTC - in response to Message 1652633.  

AP is up and starting to validate the backlog. Yay!!!


Woohoo!!
rOZZ
Music
Pictures
ID: 1652636
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1652643 - Posted: 13 Mar 2015, 22:31:26 UTC
Last modified: 13 Mar 2015, 22:31:43 UTC

Okay.. AP, such as it is, is back and creating/assimilating new work. Good. Of course while revving up all these engines lando decided to have a little NFS freakout. I just had to hard power cycle it but it's behaving now. I'll post a more formal news item on Monday, once we're sure things are stable...

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1652643
Zalster
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1652648 - Posted: 13 Mar 2015, 22:37:11 UTC - in response to Message 1652643.  

Thanks Matt !!
ID: 1652648
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1652649 - Posted: 13 Mar 2015, 22:39:59 UTC - in response to Message 1652643.  

Thank you Matt.

Claggy
ID: 1652649
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1652652 - Posted: 13 Mar 2015, 22:54:59 UTC

Oh now it looks like georgem is crashing too with the same NFS errors. Ha ha. Happy Friday afternoon...

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1652652
Gary Charpentier
Volunteer tester
Joined: 25 Dec 00
Posts: 30591
Credit: 53,134,872
RAC: 32
United States
Message 1652657 - Posted: 13 Mar 2015, 22:59:48 UTC - in response to Message 1652652.  

Oh now it looks like georgem is crashing too with the same NFS errors. Ha ha. Happy Friday afternoon...

- Matt

Ouch, everything bad happens on Friday near quitting time. :(
ID: 1652657
Julie
Volunteer moderator
Volunteer tester
Joined: 28 Oct 09
Posts: 34041
Credit: 18,883,157
RAC: 18
Belgium
Message 1652665 - Posted: 13 Mar 2015, 23:16:01 UTC - in response to Message 1652657.  

Oh now it looks like georgem is crashing too with the same NFS errors. Ha ha. Happy Friday afternoon...

- Matt

Ouch, everything bad happens on Friday near quitting time. :(


And it is (was here) the 13th, good to see marvin up and running.
rOZZ
Music
Pictures
ID: 1652665
ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1652671 - Posted: 14 Mar 2015, 0:01:21 UTC - in response to Message 1652657.  

Oh now it looks like georgem is crashing too with the same NFS errors. Ha ha. Happy Friday afternoon...

- Matt

Ouch, everything bad happens on Friday near quitting time. :(

In my extensive experience as an experimental physicist at various institutes around the world, everything bad happens on a Friday afternoon after the technicians have left, but before the scientists start knocking off for the weekend.
ID: 1652671
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22149
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1653130 - Posted: 15 Mar 2015, 10:03:14 UTC

...Things always break just after the support staff reach the "Point of Unreachability".
Thanks for your efforts Matt et al., it's good to see APs back in production :-)
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1653130

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.