Long Outage (Jun 23 2009)

Message boards : Technical News : Long Outage (Jun 23 2009)

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 910566 - Posted: 23 Jun 2009, 23:09:29 UTC

Usual outage today (which happens every Tuesday for mysql database compression/backup). It went really long - I guess we've been busy inserting/deleting all last week. We went back to an older policy of doing simultaneous compression on both the master and replica, which should vastly speed up post-outage recovery. Until today we've been letting the compression commands (e.g. "alter table user type = innodb") pass from the master to the replica via the usual channels, but they wouldn't happen in parallel (as the loooong queries had to complete successfully on the master before the replica would start processing them). This caused the replica to be as many as four hours behind when the project started up again in the afternoon. The benefit of doing it that way was less work/management, and accidental updates/inserts during the outage wouldn't get lost. Going back to doing it in parallel, we have to stop the replica before we start and reset the master after we're done, thus increasing the chance of these lost queries, but so far we've had 0 such incidents during these weekly outages since we started using mysql years ago.
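
A rough way to see where the replica lag comes from is a small timing model. This is only a sketch, not anything from the project's actual tooling: the function names and the per-table durations are made up for illustration, and it assumes the replica can only start a compression statement once the master has finished that same statement.

```python
def replica_lag_replicated(master_hours, replica_hours):
    """Lag (in hours) when the ALTERs flow through replication.

    The replica may start table i only after the master has finished
    table i and the replica itself has finished table i-1.
    """
    master_done = 0.0
    replica_done = 0.0
    for m, r in zip(master_hours, replica_hours):
        master_done += m
        replica_done = max(replica_done, master_done) + r
    return max(0.0, replica_done - master_done)


def replica_lag_parallel(master_hours, replica_hours):
    """Lag (in hours) when both servers run the ALTERs independently."""
    return max(0.0, sum(replica_hours) - sum(master_hours))


if __name__ == "__main__":
    # Hypothetical per-table compression times, identical on both servers.
    hours = [1.0, 2.0, 1.0]
    print("replicated:", replica_lag_replicated(hours, hours))
    print("parallel:  ", replica_lag_parallel(hours, hours))
```

With these placeholder numbers the replicated path leaves the replica behind when the master finishes, while the parallel path does not. Real durations will differ per table and per server.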

A weekly planned outage is usually a good time to take care of some offline chores. Today I cleaned up lots of unnecessary mounts in an effort to reduce our automounter maps as much as possible (so we don't have such a tangled web, which can be quite painful when one server disappears). I also made vader the sole download server, thus freeing bane to be whatever we want - which will be useful to handle certain services temporarily as we go around upgrading the out-of-date operating systems on lots of these machines. I think vader can handle the load alone.

I hear the presentations from the 10th anniversary celebration have all been converted to mpegs. It's a few gigs worth of stuff on a computer down on campus. A flash drive containing all that will appear up here at our lab sometime in the near future. Or it may be hosted on an interim server. We shall see.

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 910566
Profile Gary Charpentier
Volunteer tester
Joined: 25 Dec 00
Posts: 30591
Credit: 53,134,872
RAC: 32
United States
Message 910571 - Posted: 23 Jun 2009, 23:16:46 UTC - in response to Message 910566.  

Thanks for the update.

Hope vader is up to it.

Hey, wow, tag buttons above!

ID: 910571
Profile Dr. C.E.T.I.
Joined: 29 Feb 00
Posts: 16019
Credit: 794,685
RAC: 0
United States
Message 910589 - Posted: 23 Jun 2009, 23:34:49 UTC

. . . Thanks for the Updates Matt [and the Forum BBCode as well]

> re-posted your comment re: the Videos - in the SETI Cafe - ustream.tv SETI Anniversary webcast

Thanks for that Info as well


BOINC Wiki . . .

Science Status Page . . .
ID: 910589
Profile Jack Zhang
Volunteer tester
Joined: 2 Jul 06
Posts: 206
Credit: 6,142,449
RAC: 0
Canada
Message 910638 - Posted: 24 Jun 2009, 2:55:44 UTC - in response to Message 910572.  

The DB purge is now running, but the scheduler process is still not running. Maybe we're waiting for the splitters to pick up the slack before work is sent out.
What if Fiction was Fact and Fact was Fiction and vice versa?
ID: 910638
zpm
Volunteer tester
Joined: 25 Apr 08
Posts: 284
Credit: 1,659,024
RAC: 0
United States
Message 910653 - Posted: 24 Jun 2009, 4:17:03 UTC - in response to Message 910638.  

Glad to see that the backend of the BOINC webpage was updated too....

I recommend Secunia PSI: http://secunia.com/vulnerability_scanning/personal/
Go Georgia Tech.
ID: 910653
mtlmrgn

Joined: 15 Mar 06
Posts: 1
Credit: 190,346
RAC: 0
United States
Message 910822 - Posted: 24 Jun 2009, 19:31:16 UTC - in response to Message 910566.  

June 24th: not getting any downloads today. As you can see, it is not my connection.
ID: 910822
Gorim1

Joined: 15 Nov 06
Posts: 4
Credit: 1,536,081
RAC: 0
Poland
Message 910836 - Posted: 24 Jun 2009, 19:56:38 UTC

Bruce, I have the same problem...
I'm running out of work to do.
Do something about it, guys....
ID: 910836
Profile Andy Lee Robinson
Joined: 8 Dec 05
Posts: 630
Credit: 59,973,836
RAC: 0
Hungary
Message 910865 - Posted: 24 Jun 2009, 20:49:04 UTC - in response to Message 910566.  
Last modified: 24 Jun 2009, 20:50:47 UTC

Matt, just stop slave on the replica and temporarily remove alter privileges for the replication account.
Run the alter command locally on the slave as root, then start slave when it finishes; it'll carry on from the last position and won't have to wait for the master to finish.
The alter command from the master will be ignored on the slave (or can be made to be). If it causes replication to stop, use "set global sql_slave_skip_counter=1; start slave;" to skip over it and continue.
Once the slave has read past the alter command, just restore the privileges for the repl account on the slave (before any other needed alter commands come along!).
This way, you shouldn't lose any queries at all, nor have to make notes of master pointers.
The entire thing could be done as a script, "tuesday_backup.sh", run from cron!
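
The replica-side steps above could be sketched roughly as follows. This is an illustration only, not the project's script: the function names are hypothetical, the table list is a placeholder, and the privilege changes on the replication account are left out because how they are granted depends on the setup. The sketch only builds the statements as strings so they can be reviewed before anything is run.

```python
def local_compress_statements(tables):
    """Statements to run on the replica, locally, for the weekly compression."""
    statements = ["stop slave;"]
    for table in tables:
        # Statement form as quoted in this thread; newer MySQL versions
        # spell the option ENGINE rather than TYPE.
        statements.append(f"alter table {table} type = innodb;")
    statements.append("start slave;")
    return statements


def skip_replicated_alter_statements():
    """Statements to run only if replication stops on the master's alter."""
    return ["set global sql_slave_skip_counter=1;", "start slave;"]


if __name__ == "__main__":
    # "user" is the example table named earlier in the thread.
    for statement in local_compress_statements(["user"]):
        print(statement)
```

The skip-counter statements are kept in a separate function on purpose: setting the counter up front would skip whatever event comes next, which may not be the alter.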
ID: 910865
Profile i_mcintosh

Joined: 30 May 99
Posts: 3
Credit: 1,038,479
RAC: 0
United Kingdom
Message 911159 - Posted: 25 Jun 2009, 10:58:29 UTC - in response to Message 910822.  

Mid-day (25th) here in the UK, still no downloads here either.

I'm out of work :(

Maybe the ETs are blocking it????
ID: 911159
Profile Gundolf Jahn

Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 911162 - Posted: 25 Jun 2009, 11:04:39 UTC - in response to Message 911159.  

Maybe the ETs are blocking it????

Or maybe your ISP or DNS server? :-)
ID: 911162
Profile Virtual Boss*
Volunteer tester
Joined: 4 May 08
Posts: 417
Credit: 6,440,287
RAC: 0
Australia
Message 911166 - Posted: 25 Jun 2009, 11:28:44 UTC - in response to Message 911159.  

Mid-day (25th) here in the UK, still no downloads here either.

I'm out of work :(

Maybe the ETs are blocking it????



If you have tasks assigned but they are not downloading, try stopping and restarting BOINC.
That worked for my rigs.
Flying high with Team Sicituradastra.
ID: 911166
Profile i_mcintosh

Joined: 30 May 99
Posts: 3
Credit: 1,038,479
RAC: 0
United Kingdom
Message 911174 - Posted: 25 Jun 2009, 11:51:37 UTC - in response to Message 911166.  

Tried restarting the PCs entirely...

25/06/2009 12:50:01 Internet access OK - project servers may be temporarily down.
25/06/2009 12:50:22 Project communication failed: attempting access to reference site
25/06/2009 12:50:22 SETI@home Temporarily failed download of ap_graphics_5.05_windows_intelx86.exe: connect() failed
25/06/2009 12:50:22 SETI@home Backing off 1 min 0 sec on download of ap_graphics_5.05_windows_intelx86.exe
25/06/2009 12:50:22 SETI@home [error] File ap405.jpg has wrong size: expected 7653, got 0
25/06/2009 12:50:22 SETI@home Started download of ap405.jpg
25/06/2009 12:50:23 Internet access OK - project servers may be temporarily down.

ID: 911174
Profile Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 911185 - Posted: 25 Jun 2009, 12:20:21 UTC - in response to Message 911174.  
Last modified: 25 Jun 2009, 12:21:25 UTC

Tried restarting the PCs entirely...

25/06/2009 12:50:01 Internet access OK - project servers may be temporarily down.
25/06/2009 12:50:22 Project communication failed: attempting access to reference site
25/06/2009 12:50:22 SETI@home Temporarily failed download of ap_graphics_5.05_windows_intelx86.exe: connect() failed
25/06/2009 12:50:22 SETI@home Backing off 1 min 0 sec on download of ap_graphics_5.05_windows_intelx86.exe
25/06/2009 12:50:22 SETI@home [error] File ap405.jpg has wrong size: expected 7653, got 0
25/06/2009 12:50:22 SETI@home Started download of ap405.jpg
25/06/2009 12:50:23 Internet access OK - project servers may be temporarily down.


In your settings, under Computing Preferences, the last setting states:

Skip image file verification?
Check this ONLY if your Internet provider modifies image files (UMTS does this, for example).
Skipping verification reduces the security of BOINC.


You should set this to yes. You need to skip verification to clear the graphics faults you have.

A restart of the computer should clear the rest.
Boinc....Boinc....Boinc....Boinc....
ID: 911185
Profile i_mcintosh

Joined: 30 May 99
Posts: 3
Credit: 1,038,479
RAC: 0
United Kingdom
Message 911226 - Posted: 25 Jun 2009, 13:59:53 UTC - in response to Message 911185.  

Brilliant! Thanks tons :)




In your settings, under Computing Preferences, the last setting states:

Skip image file verification?
Check this ONLY if your Internet provider modifies image files (UMTS does this, for example).
Skipping verification reduces the security of BOINC.


You should set this to yes. You need to skip verification to clear the graphics faults you have.

A restart of the computer should clear the rest.


ID: 911226
C

Joined: 3 Apr 99
Posts: 240
Credit: 7,716,977
RAC: 0
United States
Message 911235 - Posted: 25 Jun 2009, 14:17:05 UTC - in response to Message 911166.  

Mid-day (25th) here in the UK, still no downloads here either.

I'm out of work :(

Maybe the ETs are blocking it????



If you have tasks assigned but they are not downloading, try stopping and restarting BOINC.
That worked for my rigs.


Thanks, Boss - that worked for my machines.

C

Join Team MacNN
ID: 911235



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.