Hinke (Apr 20 2009)


Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 886811 - Posted: 20 Apr 2009, 23:04:44 UTC

The mysql database crashed on Friday, then again on Saturday. The reasons are mysterious, though we've had similar crashes in the past - just not two in immediate succession like that. Most of the large, important tables (user, host, workunit, result) are using the innodb engine, while the many others (including team, forum preferences, posts, etc.) are using mysql's standard myisam engine. There's worry we may have lost a few rows in some of the myisam tables, though they seem to check out okay. The replica database, though, is in a confused state so we just shut it off for the time being. We're going to save any remaining cleanup for tomorrow's usual outage. As stated elsewhere, Jeff and I have adopted a policy of no-system-changes (except for emergencies) until after the anniversary. So as long as mysql continues to run well, we're not going to worry about this so much.

I know I write all these missives and therefore I get the brunt of the accolades (or otherwise) but Jeff/Bob pretty much took care of the entire mess above. I did log in on Sunday and cleaned up the server status page and the validators (which for some reason *have* to start on the command line, as opposed to the usual cron job which restarts stopped processes), but that's the usual drill (we're always logging in on nights/weekends to kick one process or another).

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 13625
Credit: 30,983,671
RAC: 19,884
United States
Message 886815 - Posted: 20 Apr 2009, 23:12:43 UTC

Boy did you ever start a tornado of posts in your last update!


My apologies to Jeff and Bob for not giving them more credit in the "Thank you Admins" thread in Number Crunching. While their voices are not heard, their actions are noticed even if they are mistaken to be yours, Matt. It's the silent heroes that know the score.
____________

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 15,899,386
RAC: 11,045
United States
Message 886817 - Posted: 20 Apr 2009, 23:16:42 UTC - in response to Message 886815.

I was digging for the url as OzzFan posted. I thought you would like to see it.
http://setiathome.berkeley.edu/forum_thread.php?id=53172

We weren't sure which of you got us back on track but it was really appreciated.
____________


PROUD MEMBER OF Team Starfire World BOINC

PhonAcq
Joined: 14 Apr 01
Posts: 1622
Credit: 22,356,438
RAC: 6,174
United States
Message 886830 - Posted: 21 Apr 2009, 0:23:38 UTC

I really hope that Matt, et al. don't get burned by the user group's unpolished comments on these boards. I have to slap my own fingers at times.

Let's all remember that THEY are the professionals and WE are the amateurs, despite our setifarms and quadtrillion credits. And, WE must each begin each discussion with the assumption that THEY are doing the best job they can. After all, over the years the system has evolved for the better despite periods of setbacks.

Now if only one of us could find ET.

Jonathan
Joined: 3 Oct 01
Posts: 1
Credit: 1,050,640
RAC: 0
United States
Message 886833 - Posted: 21 Apr 2009, 0:34:40 UTC

I did a quick forum search here and nothing came up. Has moving to another database engine such as PostgreSQL been considered? I realize this would be post anniversary if it did happen because of the no-system-changes policy.

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 886843 - Posted: 21 Apr 2009, 1:31:06 UTC - in response to Message 886830.

I really hope that Matt, et al. don't get burned by the user group's unpolished comments on these boards. I have to slap my own fingers at times.

Let's all remember that THEY are the professionals and WE are the amateurs, despite our setifarms and quadtrillion credits. And, WE must each begin each discussion with the assumption that THEY are doing the best job they can. After all, over the years the system has evolved for the better despite periods of setbacks.

Now if only one of us could find ET.


I agree completely.

As I see it, there are a practically infinite number of "US" and while each of us has talent and experience (in some cases, decades of experience) we also have our biases.

In other words, for every possible choice, there is very likely a camp that would advocate each option, and for each group that thinks MySQL is the best, there is another that thinks it should be something else.

... and at the end of the day, we aren't the ones doing the work. It's Bob, and Jeff, and Matt.
____________

Glenn savill
Joined: 20 Aug 99
Posts: 2565
Credit: 3,745,629
RAC: 23,511
Australia
Message 886914 - Posted: 21 Apr 2009, 9:04:31 UTC

Hmm, I come to Technical News and find just apologies to Jeff and Bob! I thought this is where there is tech news about the site, or apologies to the users for a problem not fixed, which has never happened in the 10 years I have been a member. It is also where SETI (aka BOINC) tries to explain what has happened when there was a system crash and how long it might take to fix. Can we keep this part of the message boards to TECH NEWS ONLY, please?
____________

John G
Joined: 29 Dec 01
Posts: 63
Credit: 10,142,278
RAC: 0
Canada
Message 886921 - Posted: 21 Apr 2009, 10:31:42 UTC

Is there any chance anybody could look into why SETI is not updating its BOINC stats? Hey, and thanks to the guys who kicked the server over the weekend ---- much appreciated !!!!!

WinterKnight
Volunteer tester
Joined: 18 May 99
Posts: 8682
Credit: 24,920,148
RAC: 27,774
United Kingdom
Message 886923 - Posted: 21 Apr 2009, 11:20:00 UTC - in response to Message 886921.

Is there any chance anybody could look into why SETI is not updating its BOINC stats? Hey, and thanks to the guys who kicked the server over the weekend ---- much appreciated !!!!!

The stats output is always taken from the replica database, and as Matt said, it is in a confused state and has therefore been switched off.
Hence no stats until it is restored.

Mike
Project donor
Volunteer tester
Joined: 17 Feb 01
Posts: 24483
Credit: 33,801,419
RAC: 23,299
Germany
Message 886943 - Posted: 21 Apr 2009, 13:46:52 UTC - in response to Message 886815.

Boy did you ever start a tornado of posts in your last update!


My apologies to Jeff and Bob for not giving them more credit in the "Thank you Admins" thread in Number Crunching. While their voices are not heard, their actions are noticed even if they are mistaken to be yours, Matt. It's the silent heroes that know the score.


That's why I started the thread over in NC.
I know how it works on other projects.
Will try to make a donation for the admins on my birthday in June to sponsor a nice dinner for the staff.


____________

elgar
Joined: 21 May 99
Posts: 69
Credit: 2,687,478
RAC: 0
United States
Message 886976 - Posted: 21 Apr 2009, 21:08:02 UTC

Trying to bring 12 cores online as last week Matt L. said SETI is running out of computing power. Been trying since last Friday to get them working, finally got WUs very briefly this morning and then SETI went offline again, so those ran out.

I'll tell you why people are bailing out of this project: the perception, right or wrong, is that SETI is unable to stay 'up'. Anyone trying to join in the last couple of months is going to be frustrated, and then they'll get asked for money. And then, they'll move on to something else.

It shouldn't be a mystery as to why people are leaving.

HAL
Joined: 28 Mar 03
Posts: 704
Credit: 870,617
RAC: 0
United States
Message 886991 - Posted: 21 Apr 2009, 21:19:19 UTC - in response to Message 886921.

Is there any chance anybody could look into why SETI is not updating its BOINC stats? Hey, and thanks to the guys who kicked the server over the weekend ---- much appreciated !!!!!

To me the STATS are something farmed out - it has nothing to do with the PROJECT. They have ALWAYS been behind reality - they run when they run and that's it. Little about STATS has changed since SETI CLASSIC. STATS catch up EVENTUALLY, and as for the OTHER GUYS I CAN ONLY SAY - blessed are the peacemakers for they shall inherit the project. They don't receive the recognition they deserve.
____________

Classic WU= 7,237 Classic Hours= 42,079

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 886994 - Posted: 21 Apr 2009, 21:21:13 UTC - in response to Message 886976.


I'll tell you why people are bailing out of this project: the perception, right or wrong, is that SETI is unable to stay 'up'. Anyone trying to join in the last couple of months is going to be frustrated, and then they'll get asked for money. And then, they'll move on to something else.

I think the problem is the perception that SETI needs to be up, not the perception that it can't stay up.

... and I think many of the more active forum members (and certainly anyone who has built a computer just to crunch) forget that we're supposed to be harnessing a waste product.

Idle clock cycles. That's all SETI@Home really asks.

Anything beyond that is gratefully accepted.

____________

HAL
Joined: 28 Mar 03
Posts: 704
Credit: 870,617
RAC: 0
United States
Message 887068 - Posted: 21 Apr 2009, 23:26:54 UTC - in response to Message 886976.
Last modified: 21 Apr 2009, 23:28:08 UTC

Trying to bring 12 cores online as last week Matt L. said SETI is running out of computing power. Been trying since last Friday to get them working, finally got WUs very briefly this morning and then SETI went offline again, so those ran out.

I'll tell you why people are bailing out of this project: the perception, right or wrong, is that SETI is unable to stay 'up'. Anyone trying to join in the last couple of months is going to be frustrated, and then they'll get asked for money. And then, they'll move on to something else.

It shouldn't be a mystery as to why people are leaving.

UP TIME doesn't seem to be a problem with the wingmen I have observed - usually they are newbies running an initial load of AP units or a lot of pending "other units", none of which they have YET received credit for. Even I shudder when I get an AP with a completion time of 97 hours and it takes 170 hours, or an AP at 107 hours and it takes 270 hours. A newbie I think should only have a cache of 2 days regardless of the number of processors and should have a credit score higher than zero in order to UPGRADE preferences. It is TRUE loss of UP TIME is bad P.R. but doesn't explain long term crunchers bailing out. My last comment might be better addressed in a political thread, but considering the economies I MUST MAKE due to financial reasons, I still donated to SETI because it is something I have believed in over the years.
____________

Classic WU= 7,237 Classic Hours= 42,079

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 887086 - Posted: 21 Apr 2009, 23:51:26 UTC - in response to Message 887068.

A newbie I think should only have a cache of 2 days regardless of the number of processors and should have a credit score higher than zero in order to UPGRADE preferences.

If a brand-new cruncher had a quota of 1 WU/CPU/Day that would make sure that the new cruncher didn't download weeks of initial work.

When they return 1 valid work unit, their quota would double to two, and so on until it gets up to 100 work units.

The code would be simple, but I'm sure that someone would find a way to describe this as discriminatory, or worse.
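
A minimal sketch of the doubling quota described in this post, not the actual BOINC scheduler code. The function names and the `max_quota` parameter are hypothetical; the starting value of 1 per CPU per day, the doubling on each valid result, and the cap of 100 come from the post.

```python
def next_quota(current_quota: int, max_quota: int = 100) -> int:
    """Double the per-CPU daily quota after one valid result, capped at max_quota."""
    return min(current_quota * 2, max_quota)


def quota_after(valid_results: int, start_quota: int = 1, max_quota: int = 100) -> int:
    """Quota reached after a number of valid results (hypothetical helper)."""
    quota = start_quota
    for _ in range(valid_results):
        quota = next_quota(quota, max_quota)
    return quota


if __name__ == "__main__":
    # Show how quickly a new host would reach the cap.
    for n in range(9):
        print(n, quota_after(n))
```

Under these assumptions a new host reaches the cap after seven valid results, so the restriction would only bite during the first few days.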

____________

HAL
Joined: 28 Mar 03
Posts: 704
Credit: 870,617
RAC: 0
United States
Message 887092 - Posted: 22 Apr 2009, 0:15:16 UTC - in response to Message 887086.

Some months ago I brought on board a 1 GHz Pentium for a BURN IN and got an initial load of an MB unit with a completion time of 240 hours - Was I discriminatory about aborting it?
____________

Classic WU= 7,237 Classic Hours= 42,079

WinterKnight
Volunteer tester
Joined: 18 May 99
Posts: 8682
Credit: 24,920,148
RAC: 27,774
United Kingdom
Message 887113 - Posted: 22 Apr 2009, 1:33:37 UTC - in response to Message 887092.
Last modified: 22 Apr 2009, 1:33:55 UTC

Some months ago I brought on board a 1 GHz Pentium for a BURN IN and got an initial load of an MB unit with a completion time of 240 hours - Was I discriminatory about aborting it?

I would run a burn in test for at least 24hrs.
I know that the DCF for MB tasks, using default app is ~0.25.
I know if I used the optimised app it would probably double performance.

Therefore is it a problem if the burn in takes a little bit longer than 24 hrs?
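
The arithmetic behind this reply can be sketched as follows, assuming the 240-hour figure was an initial estimate made before any duration correction factor (DCF) had been learned, and that the estimate simply scales with the DCF. The function name and the 2x speedup parameter are illustrative; the 240 hours, the DCF of ~0.25, and the "probably double" come from the posts above.

```python
def corrected_hours(estimated_hours: float, dcf: float, speedup: float = 1.0) -> float:
    """Scale a raw completion estimate by the DCF and an assumed app speedup."""
    return estimated_hours * dcf / speedup


if __name__ == "__main__":
    raw = 240.0  # initial estimate quoted for the 1 GHz Pentium
    print(corrected_hours(raw, dcf=0.25))               # default app
    print(corrected_hours(raw, dcf=0.25, speedup=2.0))  # optimised app, assumed 2x
```

Under those assumptions the task would take roughly 60 hours with the default app and about 30 hours with the optimised one, which is the "little bit longer than 24 hrs" in question.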

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 887152 - Posted: 22 Apr 2009, 2:26:22 UTC - in response to Message 887092.

Some months ago I brought on board a 1 GHz Pentium for a BURN IN and got an initial load of an MB unit with a completion time of 240 hours - Was I discriminatory about aborting it?

I wouldn't say so, but it seems that no matter what you do, there is someone who will find a way to be offended.
____________

PhonAcq
Joined: 14 Apr 01
Posts: 1622
Credit: 22,356,438
RAC: 6,174
United States
Message 887423 - Posted: 22 Apr 2009, 23:52:04 UTC - in response to Message 887086.

I agree with you completely. An alternative approach would be to enable cancellations originating from the server side, perhaps if the client isn't 'living up to expectations' regarding work return rates.

I'm all for discrimination when it makes sense. Hell, I like Coke over Pepsi, and neither is better than a simple beer. So lock me up, too.

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 15,899,386
RAC: 11,045
United States
Message 887432 - Posted: 23 Apr 2009, 0:16:37 UTC - in response to Message 887152.


I wouldn't say so, but it seems that no matter what you do, there is someone who will find a way to be offended.


I'm offended that you would think I would be offended by what you say!! :)

____________


PROUD MEMBER OF Team Starfire World BOINC


Copyright © 2014 University of California