Return (Feb 16 2012)


Message boards : Technical News : Return (Feb 16 2012)

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1391
Credit: 74,079
RAC: 10
United States
Message 1196110 - Posted: 16 Feb 2012, 21:04:27 UTC

Hello gang. I'm back from the latest bout of alternative career maintenance. Seems like I didn't miss too much, and for once the server problems waited until *after* I returned. My next disappearance (only about 10 days) will be in mid-April (touring in Argentina, Chile, and Brazil).

Before the usual Tuesday server outage, Jeff noticed the splitters were having trouble inserting new work into the science database. After some detective work and tests, we found we had hit one of several possible Informix logical limits: we ran out of extents in the workunit table.

Not a big deal, and we've hit this limit with other tables several times before. But the fix is a bit of a hassle. Basically you have to create a whole new table from scratch with more extents and repopulate it with all the data from the "full" table. We have a billion workunits in that table, so to speed this process up we moved over only the workunits from the last 90 days before turning the projects on again. The assimilators only need the most recent 90 days of workunits to work, but to get the NTPCkrs rolling again we need to repopulate the whole thing, which we'll do at a more leisurely pace.
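
As an illustration only (this is not the project's actual procedure or code), the two-phase repopulation described above could be sketched roughly as follows. The sketch uses Python's built-in sqlite3 module rather than Informix, and the table name workunit_new, the columns id, name, and create_time, and all function names are assumptions invented for the example. In Informix the new table would also be created with larger extent sizes, which SQLite has no equivalent for.

```python
import sqlite3

RECENT_DAYS = 90          # window the assimilators need, per the post
SECONDS_PER_DAY = 86400


def create_new_table(conn):
    # Stand-in for recreating the table; a real Informix rebuild would also
    # specify larger extent sizes here (SQLite has no such setting).
    conn.execute(
        "CREATE TABLE workunit_new "
        "(id INTEGER PRIMARY KEY, name TEXT, create_time REAL)"
    )
    conn.commit()


def copy_recent(conn, now):
    """Phase 1: move only workunits from the last RECENT_DAYS days."""
    cutoff = now - RECENT_DAYS * SECONDS_PER_DAY
    conn.execute(
        "INSERT INTO workunit_new "
        "SELECT id, name, create_time FROM workunit WHERE create_time >= ?",
        (cutoff,),
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM workunit_new").fetchone()[0]


def backfill_older(conn, now, batch_size=1000):
    """Phase 2: copy the older rows in small batches, at leisure."""
    cutoff = now - RECENT_DAYS * SECONDS_PER_DAY
    last_id, copied = 0, 0
    while True:
        # Walk the old rows in id order so each batch resumes where the
        # previous one stopped.
        rows = conn.execute(
            "SELECT id, name, create_time FROM workunit "
            "WHERE create_time < ? AND id > ? ORDER BY id LIMIT ?",
            (cutoff, last_id, batch_size),
        ).fetchall()
        if not rows:
            return copied
        conn.executemany("INSERT INTO workunit_new VALUES (?, ?, ?)", rows)
        conn.commit()
        last_id = rows[-1][0]
        copied += len(rows)
```

In this sketch, the project would be switched back on after copy_recent returns, and backfill_older would run in the background afterwards. Committing after each batch keeps individual transactions small, which is the reason for batching here.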

Not sure if anybody noticed, but I got the "connecting client types" page working again (for the umpteenth time). Let's see how long before it breaks again for some inexplicable reason: http://setiathome.berkeley.edu/client_types.php

Okay. I'm sure there's lots more to report but I'm going back to beating down my e-mail spool.

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 16,349,078
RAC: 8,586
United States
Message 1196112 - Posted: 16 Feb 2012, 21:17:17 UTC - in response to Message 1196110.

Welcome back Matt, hope you had a good time.
____________


PROUD MEMBER OF Team Starfire World BOINC

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4238
Credit: 34,918,901
RAC: 24,264
United Kingdom
Message 1196113 - Posted: 16 Feb 2012, 21:21:28 UTC - in response to Message 1196110.

Thanks for the update Matt, welcome back,

Claggy

QSilver
Joined: 26 May 99
Posts: 232
Credit: 4,846,834
RAC: 1,012
United States
Message 1196122 - Posted: 16 Feb 2012, 22:20:13 UTC

Welcome back, Matt, and thanks for the quick update.

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2355
Credit: 8,939,664
RAC: 4,126
United States
Message 1196137 - Posted: 16 Feb 2012, 23:45:21 UTC

Welcome back. It's nice to hear what's going on behind the scenes.

Since we're talking about database stuff... is there anything that can be done for "stuck" WUs that have been pending for several months, or in some cases, years? They are ones where _0 and _1 got credit granted before _2 returned its result, and therefore _2 is stuck waiting.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Profile Michel448a
Volunteer tester
Joined: 27 Oct 00
Posts: 1201
Credit: 2,891,635
RAC: 0
Canada
Message 1196195 - Posted: 17 Feb 2012, 2:45:26 UTC - in response to Message 1196110.

welcome back Matt !!



____________

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4346
Credit: 1,122,698
RAC: 695
United States
Message 1196397 - Posted: 17 Feb 2012, 17:47:58 UTC - in response to Message 1196137.

Welcome back. It's nice to hear what's going on behind the scenes.

Since we're talking about database stuff... is there anything that can be done for "stuck" WUs that have been pending for several months, or in some cases, years? They are ones where _0 and _1 got credit granted before _2 returned its result, and therefore _2 is stuck waiting.

Examples found at the end of the pending lists of the current top 20 hosts, WUs 764386014, 783672952, 785186126, 785467923, 785746766, 798674557, 802307404, 805724125, 806011986, 811044806, and 836743548.

As the last activity on all of those was more than 90 days ago, doing something now might not be sensible.
Joe

Profile KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 2002
Credit: 11,169,617
RAC: 12,833
United States
Message 1196429 - Posted: 17 Feb 2012, 19:46:09 UTC
Last modified: 17 Feb 2012, 19:50:21 UTC

Thanks for getting that page back online... the data in it were really old (around October 2010, IIRC) and didn't include a lot of the more modern versions of the BOINC client...

Next low-priority thing to work on: getting the telescope pointing data on the "Science Status" page working again.
____________
.


Copyright © 2014 University of California