The Indestructible Drop (Nov 02 2011)


Message boards : Technical News : The Indestructible Drop (Nov 02 2011)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 1167205 - Posted: 2 Nov 2011, 17:26:55 UTC

Usually when I take large chunks of time away from the lab, the servers get sad and Jeff has to deal with some extra sysadmin chaos beyond the usual grind we both deal with day to day. This time they kindly waited for me to get back.

The BOINC web/alpha project server went kaput on Monday, the day I returned. This wasn't the worst tragedy, as all I had to do was reinstall the OS. However, during that process one of the non-OS filesystems in the machine got screwed up. Classic Linux software RAID behavior: it fails for no apparent reason, and instead of recovering gracefully like it should, it gets into a funny state that makes recovery impossible. As I've griped in the past, Linux software RAID is really only good for organized, faster storage, with the side benefit of maybe, on very rare occasions, actually protecting the data. The downside of Linux software RAID is that it will eventually go bonkers and ruin your day.

Fine. I rebuilt the RAID, and we have daily backups of the filesystem in question (which happens to hold the entire BOINC web site). Well, it turns out the lab-wide backup system was also having problems. Talk about bad timing. Long story short, we had to wait 48 hours for the backup system to get back online, and only just now am I recovering the web site. It should be back later this afternoon in some form or another (I hope).

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4101
Credit: 33,140,679
RAC: 8,337
United Kingdom
Message 1167215 - Posted: 2 Nov 2011, 18:15:45 UTC - in response to Message 1167205.
Last modified: 2 Nov 2011, 18:22:19 UTC

Thanks for the update, Matt.

Do you have a time scale for the next scheduler rebuild? Until a properly fixed scheduler is deployed, anonymous platform hosts without flops values get the same <rsc_fpops_est> values for CPU and GPU apps, and so DCF (the duration correction factor) goes all over the place.
(Richard Haselgrove asked on Boinc_Dev: 'Not being a skilled code reader, may I ask if changeset 24385's "improve the accuracy of FLOPS estimation for GPU apps" applies to anonymous platform?', but he never got an answer from DA.)
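
For readers unfamiliar with the mechanism, here is a rough sketch of why identical estimates for CPU and GPU apps make a single shared correction factor swing. It assumes the client's runtime estimate is roughly the work estimate divided by an assumed speed, scaled by the DCF, and that the DCF jumps up immediately when a task overruns and drifts down slowly otherwise. The function names, the numbers, and the exact update rule are illustrative assumptions, not taken from the BOINC source.

```python
# Simplified model of how one shared duration correction factor (DCF)
# reacts when CPU and GPU tasks get the same raw runtime estimate.
# All numbers and the update rule are illustrative assumptions,
# not values taken from the BOINC source.


def estimated_runtime(rsc_fpops_est, flops, dcf=1.0):
    """Raw estimate (fpops / flops) scaled by the correction factor."""
    return rsc_fpops_est / flops * dcf


def update_dcf(dcf, raw_estimate, actual_runtime):
    """Assumed rule: jump up at once on an overrun, drift down slowly."""
    ratio = actual_runtime / raw_estimate
    if ratio > dcf:
        return ratio
    return 0.9 * dcf + 0.1 * ratio


if __name__ == "__main__":
    # Same raw estimate for both app types, as described in the post.
    raw = estimated_runtime(rsc_fpops_est=3e13, flops=1e10)
    dcf = 1.0
    # Hypothetical actual runtimes in seconds: slow CPU task, fast GPU task.
    runs = [("cpu", 6000.0), ("gpu", 300.0), ("gpu", 300.0), ("cpu", 6000.0)]
    for kind, actual in runs:
        dcf = update_dcf(dcf, raw, actual)
        print(f"{kind}: DCF is now {dcf:.2f}, next estimate {raw * dcf:.0f} s")
```

With these made-up numbers the factor jumps to 2.0 after each CPU task and drifts back down after each GPU task, so neither kind of task ever gets a stable estimate.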

Claggy

QSilver
Joined: 26 May 99
Posts: 229
Credit: 4,674,865
RAC: 2,987
United States
Message 1167222 - Posted: 2 Nov 2011, 18:50:08 UTC

Thanks for the update, Matt.

Italy98
Joined: 15 May 99
Posts: 27
Credit: 394,994
RAC: 0
United States
Message 1167277 - Posted: 2 Nov 2011, 21:20:04 UTC

I believe there are still gremlins running amok in the servers. I tried to log on to both BOINC and SETI and had to revert to using the authenticator string for SETI. BOINC is recognizing neither my regular login info (No account with email address xxxxx@xxxxx.net exists. Please go back and try again.) nor the authenticator string. On the BOINC download page (http://boinc.berkeley.edu/download.php) the following is displayed: Download BOINC, 6.12.34 for Windows 64-bit (0.00 MB).

Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 358,718
RAC: 27
Germany
Message 1167335 - Posted: 3 Nov 2011, 0:17:24 UTC - in response to Message 1167277.
Last modified: 3 Nov 2011, 0:22:21 UTC

BOINC is recognizing neither my regular login info (No account with email address xxxxx@xxxxx.net exists. Please go back and try again.) nor the authenticator string.

Why should BOINC accept the SETI@home authenticator? (There's no such thing as a BOINC project and so no BOINC authenticator either, at least not in an account*.xml file.)

Did you have an account at the BOINC dev message boards before the outage?

Regards,
Gundolf

tbret
Project donor
Volunteer tester
Joined: 28 May 99
Posts: 2807
Credit: 211,531,540
RAC: 174,842
United States
Message 1167341 - Posted: 3 Nov 2011, 1:03:21 UTC - in response to Message 1167205.

Thanks for the information, Matt.

You complain about this RAID setup fairly frequently. Is there nothing we can help with, like a really shiny new hardware RAID controller?

Italy98
Joined: 15 May 99
Posts: 27
Credit: 394,994
RAC: 0
United States
Message 1167450 - Posted: 3 Nov 2011, 11:45:32 UTC - in response to Message 1167335.

Hello Gundolf Jahn,

Thank you for your response, and yes, you are correct about BOINC. I am in the process of (bouncing off the walls) trying to set up our first home network, and while trying to download the current BOINC software I mistakenly tried to log in to the BOINC message boards after the download page displayed the link size as 0.00 MB. Again, danke!


KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 1948
Credit: 10,280,678
RAC: 17,834
United States
Message 1168220 - Posted: 5 Nov 2011, 14:26:28 UTC

Hi, sorry to add to your woes, but uploads are very slow, and the Scarecrow graphs have flatlined since 2300 UTC yesterday (4 Nov, 2011)

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8550
Credit: 50,384,752
RAC: 50,678
United Kingdom
Message 1168222 - Posted: 5 Nov 2011, 14:37:00 UTC - in response to Message 1168220.

Hi, sorry to add to your woes, but uploads are very slow, and the Scarecrow graphs have flatlined since 2300 UTC yesterday (4 Nov, 2011)

Scarecrow's graphs have flatlined because the source data (the XML version of the server status page) hasn't updated since 4 Nov 2011 | 21:20:07 UTC. The staff are aware of this, but it's the weekend and still very early morning in Berkeley.

Refer to the discussion threads in Number Crunching.

KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 1948
Credit: 10,280,678
RAC: 17,834
United States
Message 1168633 - Posted: 6 Nov 2011, 14:10:31 UTC - in response to Message 1168222.
Last modified: 6 Nov 2011, 14:11:21 UTC

I know the time in Berkeley (I live 15 minutes by car north of the SSL). I was posting that so that someone could look into it when they arrived.

BTW, I haven't gotten anything to download since Friday...

Copyright © 2014 University of California