The Indestructible Drop (Nov 02 2011)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1167205 - Posted: 2 Nov 2011, 17:26:55 UTC

Usually when I take large chunks of time away from the lab the servers get sad and Jeff has to deal with some extra sysadmin chaos beyond the usual grind we both deal with day to day. This time they kindly waited for me to get back.

The BOINC web/alpha project server went kaput on Monday, the day I returned. This wasn't the worst tragedy, as all I had to do was reinstall the OS. However, during that process one of the non-OS filesystems in the machine got screwed up. Classic Linux software RAID behavior: it fails for no apparent reason, and instead of recovering gracefully like it should, it gets into a funny state that makes recovery impossible. As I've griped in the past, Linux software RAID is really only good for organized, faster storage, with the side benefit of maybe, on very rare occasions, actually protecting the data. The downside of Linux software RAID is that it will eventually go bonkers and ruin your day.

Fine. I rebuilt the RAID, and we have daily backups of the filesystem in question (which happens to hold the entire BOINC web site). Well, it turns out the lab-wide backup system was also having problems. Talk about bad timing. Long story short, we had to wait 48 hours for the backup system to get back online, and only just now am I recovering the web site. It should be back later this afternoon in some form or another (I hope).

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1167215 - Posted: 2 Nov 2011, 18:15:45 UTC - in response to Message 1167205.  
Last modified: 2 Nov 2011, 18:22:19 UTC

Thanks for the update Matt,

Do you have a time scale for the next scheduler rebuild? Until a properly fixed scheduler is deployed, anonymous platform hosts without flops values get the same <rsc_fpops_est> values for CPU and GPU apps, and so DCF goes all over the place.
(Richard Haselgrove asked on Boinc_Dev: 'Not being a skilled code reader, may I ask if changeset 24385's "improve the accuracy of FLOPS estimation for GPU apps" applies to anonymous platform?', but he never got an answer from DA.)
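
For readers unfamiliar with DCF (duration correction factor), here is a rough sketch of why identical <rsc_fpops_est> values would make it swing. It assumes a simplified model in which the client estimates runtime as the work estimate divided by an assumed speed, scaled by one per-project DCF. The function names and every number are hypothetical and chosen only for illustration; this is not the actual BOINC client code.

```python
def estimated_seconds(rsc_fpops_est, assumed_flops, dcf=1.0):
    """Simplified runtime estimate: work estimate / assumed speed, scaled by DCF."""
    return rsc_fpops_est / assumed_flops * dcf


def implied_dcf(actual_seconds, rsc_fpops_est, assumed_flops):
    """DCF value that would have made the estimate match the observed runtime."""
    return actual_seconds / estimated_seconds(rsc_fpops_est, assumed_flops)


# Hypothetical numbers, for illustration only.
RSC_FPOPS_EST = 3.0e13   # same work estimate sent for CPU and GPU tasks
ASSUMED_FLOPS = 3.0e9    # one speed assumed for both apps (no flops values given)
CPU_ACTUAL = 10_000.0    # seconds a CPU task really takes
GPU_ACTUAL = 500.0       # seconds a GPU task really takes

cpu_dcf = implied_dcf(CPU_ACTUAL, RSC_FPOPS_EST, ASSUMED_FLOPS)
gpu_dcf = implied_dcf(GPU_ACTUAL, RSC_FPOPS_EST, ASSUMED_FLOPS)

print(f"DCF implied by a CPU task: {cpu_dcf:.2f}")
print(f"DCF implied by a GPU task: {gpu_dcf:.2f}")
```

Under these assumptions, each finished CPU task pulls the shared correction factor toward one value and each finished GPU task pulls it toward a very different one, so the estimates for whatever runs next are off.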

Claggy
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1167216 - Posted: 2 Nov 2011, 18:18:45 UTC

Matt to the rescue....
Glad you're back to make the servers happy again.

Meow!
"Time is simply the mechanism that keeps everything from happening all at once."

QSilver

Joined: 26 May 99
Posts: 232
Credit: 6,452,764
RAC: 0
United States
Message 1167222 - Posted: 2 Nov 2011, 18:50:08 UTC

Thanks for the update, Matt.
Italy98
Joined: 15 May 99
Posts: 27
Credit: 394,994
RAC: 0
United States
Message 1167277 - Posted: 2 Nov 2011, 21:20:04 UTC

I believe there are still gremlins running amok in the servers. I tried to log onto both BOINC and SETI and had to revert to using the authenticator string for SETI. BOINC is not recognizing either my regular login info (No account with email address xxxxx@xxxxx.net exists. Please go back and try again.) nor the authenticator string. On the BOINC download page (http://boinc.berkeley.edu/download.php) the following is displayed - Download BOINC, 6.12.34 for Windows 64-bit (0.00 MB).
Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 1167335 - Posted: 3 Nov 2011, 0:17:24 UTC - in response to Message 1167277.  
Last modified: 3 Nov 2011, 0:22:21 UTC

BOINC is not recognizing either my regular login info (No account with email address xxxxx@xxxxx.net exists. Please go back and try again.) nor the authenticator string.

Why should BOINC accept the SETI@home authenticator? (There's no such thing as a BOINC project and so no BOINC authenticator either, at least not in an account*.xml file)

Did you have an account at the BOINC dev message boards previous to the outage?

Gruß,
Gundolf
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1167341 - Posted: 3 Nov 2011, 1:03:21 UTC - in response to Message 1167205.  

Thanks for the information, Matt.

You complain about this RAID setup fairly frequently. Is there nothing we can help with... like a really shiny new hardware RAID controller?
Italy98
Joined: 15 May 99
Posts: 27
Credit: 394,994
RAC: 0
United States
Message 1167450 - Posted: 3 Nov 2011, 11:45:32 UTC - in response to Message 1167335.  

Hello Gundolf Jahn,

Thank you for your response, and yes, you are correct about BOINC. I am in the process of (bouncing off the walls) trying to set up our first home network, and while trying to download the current BOINC software I mistakenly tried to log in to the BOINC message boards after the download page displayed the link size as 0.00 MB. Again, danke!

BOINC is not recognizing either my regular login info (No account with email address xxxxx@xxxxx.net exists. Please go back and try again.) nor the authenticator string.

Why should BOINC accept the SETI@home authenticator? (There's no such thing as a BOINC project and so no BOINC authenticator either, at least not in an account*.xml file)

Did you have an account at the BOINC dev message boards previous to the outage?

Gruß,
Gundolf


KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 1168220 - Posted: 5 Nov 2011, 14:26:28 UTC

Hi, sorry to add to your woes, but uploads are very slow, and the Scarecrow graphs have flatlined since 2300 UTC yesterday (4 Nov, 2011).

Hello, from Albany, CA!...
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14674
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1168222 - Posted: 5 Nov 2011, 14:37:00 UTC - in response to Message 1168220.  

Hi, sorry to add to your woes, but uploads are very slow, and the Scarecrow graphs have flatlined since 2300 UTC yesterday (4 Nov, 2011)

Scarecrow's graphs have flatlined because the source data (the XML version of the server status page) hasn't updated since 4 Nov 2011 | 21:20:07 UTC - the staff are aware of this, but it's the weekend and still very early morning in Berkeley.

Refer to the discussion threads in Number Crunching.
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 1168633 - Posted: 6 Nov 2011, 14:10:31 UTC - in response to Message 1168222.  
Last modified: 6 Nov 2011, 14:11:21 UTC

Hi, sorry to add to your woes, but uploads are very slow, and the Scarecrow graphs have flatlined since 2300 UTC yesterday (4 Nov, 2011)

Scarecrow's graphs have flatlined because the source data (the XML version of the server status page) hasn't updated since 4 Nov 2011 | 21:20:07 UTC - the staff are aware of this, but it's the weekend and still very early morning in Berkeley.

Refer to the discussion threads in Number Crunching.


I know the time in Berkeley - I live 15 minutes by car north of the SSL - I was posting that so that someone could look into it when they arrived.

BTW, I haven't gotten anything to download since Friday...

Hello, from Albany, CA!...



 