Hinke (Apr 20 2009)

Message boards : Technical News : Hinke (Apr 20 2009)
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 886811 - Posted: 20 Apr 2009, 23:04:44 UTC

The MySQL database crashed on Friday, then again on Saturday. The reasons are mysterious, though we've had similar crashes in the past - just not two in immediate succession like that. Most of the large, important tables (user, host, workunit, result) are using the InnoDB engine, while the many others (including team, forum preferences, posts, etc.) are using MySQL's standard MyISAM engine. There's worry we may have lost a few rows in some of the MyISAM tables, though they seem to check out okay. The replica database, though, is in a confused state so we just shut it off for the time being. We're going to save any remaining cleanup for tomorrow's usual outage. As stated elsewhere, Jeff and I have adopted a policy of no-system-changes (except for emergencies) until after the anniversary. So as long as MySQL continues to run well, we're not going to worry about this so much.

I know I write all these missives and therefore I get the brunt of the accolades (or otherwise), but Jeff/Bob pretty much took care of the entire mess above. I did log in on Sunday and cleaned up the server status page and the validators (which for some reason *have* to start on the command line, as opposed to the usual cron job which restarts stopped processes), but that's the usual drill (we're always logging in on nights/weekends to kick one process or another).

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 886811 · Report as offensive
OzzFan (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 886815 - Posted: 20 Apr 2009, 23:12:43 UTC

Boy did you ever start a tornado of posts in your last update!


My apologies to Jeff and Bob for not giving them more credit in the "Thank you Admins" thread in Number Crunching. While their voices are not heard, their actions are noticed even if they are mistaken to be yours, Matt. It's the silent heroes that know the score.
ID: 886815 · Report as offensive
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 886817 - Posted: 20 Apr 2009, 23:16:42 UTC - in response to Message 886815.  

I was digging for the URL as OzzFan posted. I thought you would like to see it.
http://setiathome.berkeley.edu/forum_thread.php?id=53172

We weren't sure which of you got us back on track but it was really appreciated.


PROUD MEMBER OF Team Starfire World BOINC
ID: 886817 · Report as offensive
PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 886830 - Posted: 21 Apr 2009, 0:23:38 UTC

I really hope that Matt, et al. don't get burned by the user group's unpolished comments on these boards. I have to slap my own fingers at times.

Let's all remember that THEY are the professionals and WE are the amateurs, despite our setifarms and quadtrillion credits. And WE must each begin each discussion with the assumption that THEY are doing the best job they can. After all, over the years the system has evolved for the better despite periods of setbacks.

Now if only one of us could find ET.

ID: 886830 · Report as offensive
Jonathan

Joined: 3 Oct 01
Posts: 1
Credit: 1,050,640
RAC: 0
United States
Message 886833 - Posted: 21 Apr 2009, 0:34:40 UTC

I did a quick forum search here and nothing came up. Has moving to another database engine such as PostgreSQL been considered? I realize this would be post-anniversary if it did happen, because of the no-system-changes policy.
ID: 886833 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 886843 - Posted: 21 Apr 2009, 1:31:06 UTC - in response to Message 886830.  

I really hope that Matt, et al. don't get burned by the user group's unpolished comments on these boards. I have to slap my own fingers at times.

Let's all remember that THEY are the professionals and WE are the amateurs, despite our setifarms and quadtrillion credits. And WE must each begin each discussion with the assumption that THEY are doing the best job they can. After all, over the years the system has evolved for the better despite periods of setbacks.

Now if only one of us could find ET.


I agree completely.

As I see it, there is a practically infinite number of "US" and while each of us has talent and experience (in some cases, decades of experience) we also have our biases.

In other words, for every possible choice, there is very likely a camp that would advocate each option, and for each group that thinks MySQL is the best, there is another that thinks it should be something else.

... and at the end of the day, we aren't the ones doing the work. It's Bob, and Jeff, and Matt.
ID: 886843 · Report as offensive
Darth Beaver (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 886914 - Posted: 21 Apr 2009, 9:04:31 UTC

Hmm, I come to the tech news and find just apologies to Jeff and Bob! I thought this is where there is tech news about the site, or apologies to the users for a problem not fixed, which has never happened in the 10 years I have been a member. It is also where SETI (aka BOINC) tries to explain what has happened when there was a system crash and how long it might take to fix. Can we keep this part of the message boards to TECH NEWS ONLY, please?
ID: 886914 · Report as offensive
John G

Joined: 29 Dec 01
Posts: 68
Credit: 10,932,850
RAC: 0
Canada
Message 886921 - Posted: 21 Apr 2009, 10:31:42 UTC

Is there any chance anybody could look into why SETI is not updating its BOINC stats? Hey, and thanks to the guys who kicked the server over the weekend - much appreciated!
ID: 886921 · Report as offensive
W-K 666 (Project Donor)
Volunteer tester
Joined: 18 May 99
Posts: 19422
Credit: 40,757,560
RAC: 67
United Kingdom
Message 886923 - Posted: 21 Apr 2009, 11:20:00 UTC - in response to Message 886921.  

Is there any chance anybody could look into why SETI is not updating its BOINC stats? Hey, and thanks to the guys who kicked the server over the weekend - much appreciated!

The stats output is always taken from the replica database, and as Matt said, it is in a confused state and has therefore been switched off.
Hence no stats until it is restored.
ID: 886923 · Report as offensive
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34398
Credit: 79,922,639
RAC: 80
Germany
Message 886943 - Posted: 21 Apr 2009, 13:46:52 UTC - in response to Message 886815.  

Boy did you ever start a tornado of posts in your last update!


My apologies to Jeff and Bob for not giving them more credit in the "Thank you Admins" thread in Number Crunching. While their voices are not heard, their actions are noticed even if they are mistaken to be yours, Matt. It's the silent heroes that know the score.


That's why I started the thread over in NC.
I know how it works on other projects.
I will try to make a donation for the admins on my birthday in June to sponsor a nice dinner for the staff.




With each crime and every kindness we birth our future.
ID: 886943 · Report as offensive
elgar

Joined: 21 May 99
Posts: 69
Credit: 2,687,478
RAC: 0
United States
Message 886976 - Posted: 21 Apr 2009, 21:08:02 UTC

Trying to bring 12 cores online, as last week Matt L. said SETI is running out of computing power. Been trying since last Friday to get them working, finally got WUs very briefly this morning and then SETI went offline again, so those ran out.

I'll tell you why people are bailing out of this project: the perception, right or wrong, is that SETI is unable to stay 'up'. Anyone trying to join in the last couple of months is going to be frustrated, and then they'll get asked for money. And then, they'll move on to something else.

It shouldn't be a mystery as to why people are leaving.
ID: 886976 · Report as offensive
HAL

Joined: 28 Mar 03
Posts: 704
Credit: 870,617
RAC: 0
United States
Message 886991 - Posted: 21 Apr 2009, 21:19:19 UTC - in response to Message 886921.  

Is there any chance anybody could look into why SETI is not updating its BOINC stats? Hey, and thanks to the guys who kicked the server over the weekend - much appreciated!

To me the STATS are something farmed out - they have nothing to do with the PROJECT. They have ALWAYS been behind reality - they run when they run and that's it. Little about STATS has changed since SETI CLASSIC. STATS catch up EVENTUALLY, and as for the OTHER GUYS I CAN ONLY SAY - blessed are the peacemakers for they shall inherit the project. They don't receive the recognition they deserve.

Classic WU= 7,237 Classic Hours= 42,079
ID: 886991 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 886994 - Posted: 21 Apr 2009, 21:21:13 UTC - in response to Message 886976.  


I'll tell you why people are bailing out of this project: the perception, right or wrong, is that SETI is unable to stay 'up'. Anyone trying to join in the last couple of months is going to be frustrated, and then they'll get asked for money. And then, they'll move on to something else.

I think the problem is the perception that SETI needs to be up, not the perception that it can't stay up.

... and I think many of the more active forum members (and certainly anyone who has built a computer just to crunch) forget that we're supposed to be harnessing a waste product.

Idle clock cycles. That's all SETI@Home really asks.

Anything beyond that is gratefully accepted.

ID: 886994 · Report as offensive
HAL

Joined: 28 Mar 03
Posts: 704
Credit: 870,617
RAC: 0
United States
Message 887068 - Posted: 21 Apr 2009, 23:26:54 UTC - in response to Message 886976.  
Last modified: 21 Apr 2009, 23:28:08 UTC

Trying to bring 12 cores online, as last week Matt L. said SETI is running out of computing power. Been trying since last Friday to get them working, finally got WUs very briefly this morning and then SETI went offline again, so those ran out.

I'll tell you why people are bailing out of this project: the perception, right or wrong, is that SETI is unable to stay 'up'. Anyone trying to join in the last couple of months is going to be frustrated, and then they'll get asked for money. And then, they'll move on to something else.

It shouldn't be a mystery as to why people are leaving.

UP TIME doesn't seem to be a problem with the wingmen I have observed - usually they are newbies running an initial load of AP units or a lot of pending "other units", none of which they have gotten credit for YET. Even I shudder when I get an AP with a completion time of 97 hours and it takes 170 hours, or an AP at 107 hours and it takes 270 hours. A newbie, I think, should only have a cache of 2 days regardless of the number of processors, and should have a credit score higher than zero in order to UPGRADE preferences. It is TRUE that loss of UP TIME is bad P.R., but it doesn't explain long-term crunchers bailing out. My last comment might be better addressed in a political thread, but considering the economies I MUST MAKE for financial reasons, I still donated to SETI; it is something I have believed in over the years.

Classic WU= 7,237 Classic Hours= 42,079
ID: 887068 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 887086 - Posted: 21 Apr 2009, 23:51:26 UTC - in response to Message 887068.  

A newbie I think should only have a cache of 2 days regardless of the number of processors and should have a credit score higher than zero in order to UPGRADE preferences.

If a brand-new cruncher had a quota of 1 WU/CPU/Day that would make sure that the new cruncher didn't download weeks of initial work.

When they return 1 valid work unit, their quota would double to two, and so on until it gets up to 100 work units.

The code would be simple, but I'm sure that someone would find a way to describe this as discriminatory, or worse.
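
To make the idea above concrete, here is a minimal sketch of the doubling quota. It is only an illustration of the rule as described in this post, not taken from the BOINC server code; the names `MAX_QUOTA`, `next_quota` and `quota_progression` are made up, and leaving the quota unchanged after an invalid result is an assumption the post does not address.

```python
MAX_QUOTA = 100  # cap of 100 work units per CPU per day, as suggested in the post


def next_quota(current_quota: int, result_valid: bool) -> int:
    """Return the daily per-CPU quota after one work unit has been returned."""
    if not result_valid:
        # Assumption: an invalid result leaves the quota where it is.
        return current_quota
    # A valid result doubles the quota, up to the cap.
    return min(current_quota * 2, MAX_QUOTA)


def quota_progression(valid_results: int, start: int = 1) -> list[int]:
    """List the quota after each of `valid_results` consecutive valid results."""
    quotas = [start]
    for _ in range(valid_results):
        quotas.append(next_quota(quotas[-1], True))
    return quotas


if __name__ == "__main__":
    # Starting from 1 WU/CPU/day, the cap is reached on the 7th valid result.
    print(quota_progression(8))
```

With these numbers a new host would need seven valid results before it could fetch the full 100 work units per CPU per day.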

ID: 887086 · Report as offensive
HAL

Joined: 28 Mar 03
Posts: 704
Credit: 870,617
RAC: 0
United States
Message 887092 - Posted: 22 Apr 2009, 0:15:16 UTC - in response to Message 887086.  

Some months ago I brought on board a 1 GHz Pentium for a BURN IN and got an initial load of an MB unit with a completion time of 240 hours - Was I discriminatory about aborting it?

Classic WU= 7,237 Classic Hours= 42,079
ID: 887092 · Report as offensive
W-K 666 (Project Donor)
Volunteer tester
Joined: 18 May 99
Posts: 19422
Credit: 40,757,560
RAC: 67
United Kingdom
Message 887113 - Posted: 22 Apr 2009, 1:33:37 UTC - in response to Message 887092.  
Last modified: 22 Apr 2009, 1:33:55 UTC

Some months ago I brought on board a 1 GHz Pentium for a BURN IN and got an initial load of an MB unit with a completion time of 240 hours - Was I discriminatory about aborting it?

I would run a burn-in test for at least 24 hrs.
I know that the DCF for MB tasks, using the default app, is ~0.25.
I know if I used the optimised app it would probably double performance.

Therefore, is it a problem if the burn-in takes a little bit longer than 24 hrs?
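
The arithmetic behind that can be sketched as follows. This is one reading of the figures in this thread: it assumes the 240-hour figure is the uncorrected estimate, that the DCF (duration correction factor) simply multiplies it, and that "double performance" means half the run time. The function name `expected_hours` is made up for the illustration.

```python
def expected_hours(estimated_hours: float, dcf: float, speedup: float = 1.0) -> float:
    """Scale an initial completion estimate by the DCF and an app speedup factor."""
    return estimated_hours * dcf / speedup


if __name__ == "__main__":
    # 240-hour estimate, DCF of ~0.25 with the default app.
    print(expected_hours(240, 0.25))       # default app
    # Same task if the optimised app roughly doubles performance.
    print(expected_hours(240, 0.25, 2.0))  # optimised app
```

Under those assumptions the 240-hour estimate comes down to about 60 hours with the default app and about 30 hours with the optimised one.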
ID: 887113 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 887152 - Posted: 22 Apr 2009, 2:26:22 UTC - in response to Message 887092.  

Some months ago I brought on board a 1 GHz Pentium for a BURN IN and got an initial load of an MB unit with a completion time of 240 hours - Was I discriminatory about aborting it?

I wouldn't say so, but it seems that no matter what you do, there is someone who will find a way to be offended.
ID: 887152 · Report as offensive
PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 887423 - Posted: 22 Apr 2009, 23:52:04 UTC - in response to Message 887086.  

I agree with you completely. An alternative approach would be to enable cancellations originating from the server side, perhaps if the client isn't 'living up to expectations' regarding work return rates.

I'm all for discrimination when it makes sense. Hell, I like Coke over Pepsi, and neither is better than a simple beer. So lock me up, too.
ID: 887423 · Report as offensive
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 887432 - Posted: 23 Apr 2009, 0:16:37 UTC - in response to Message 887152.  


I wouldn't say so, but it seems that no matter what you do, there is someone who will find a way to be offended.


I'm offended that you would think I would be offended by what you say!! :)



PROUD MEMBER OF Team Starfire World BOINC
ID: 887432 · Report as offensive