Possibly no MB work for a week!!!

Message boards : Number crunching : Possibly no MB work for a week!!!
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 878914 - Posted: 24 Mar 2009, 21:17:05 UTC
Last modified: 24 Mar 2009, 21:21:21 UTC

See Matt's post HERE

Bernie
ID: 878914
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 878921 - Posted: 24 Mar 2009, 21:24:21 UTC

He did also say that AP work would likely be able to continue, so that's the solution for those who run out of MBs for CPU and CUDA. Some of you may not like running AP, but if that's all there is, there are only two choices: crunch AP or use a back-up project.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 878921
[KWSN]John Galt 007
Volunteer tester
Joined: 9 Nov 99
Posts: 2444
Credit: 25,086,197
RAC: 0
United States
Message 878923 - Posted: 24 Mar 2009, 21:36:38 UTC

Looks like SETI Beta has plenty to do...132k WUs at the moment...
Clk2HlpSetiCty:::PayIt4ward

ID: 878923
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 878928 - Posted: 24 Mar 2009, 21:49:50 UTC

I have a couple of back-up projects if I have to fall back on them.

ID: 878928
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 878931 - Posted: 24 Mar 2009, 21:53:08 UTC

To help alleviate panic, here's the gist of our general plan:

1. Get thumper back up and running with a three-way root mirror. If all goes well, this will be far enough along sometime tomorrow (Wednesday): we'll have a two-way root mirror and let the third drive sync up in the background while we bring the system up. Then, during next week's outage, we'll do more drive swapping to install grub and finish the resync on this third drive.

Splitting/assimilating will be completely off for all projects until thumper is back up.

2. As soon as thumper is back up (tomorrow?), we can turn splitting/assimilating on for AP and get to work on the pulse table reconfiguration (which we can only do if the system/database is up). The plan, in simplest terms: create new database chunks, copy the current pulse table to these new chunks, then drop the old table and rename the new one. We estimate at least 24 hours for that.
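
For anyone unfamiliar with the copy-and-swap pattern in step 2, here is a rough sketch of the idea. It uses Python's built-in sqlite3 module purely for illustration; the project's actual database, storage layout, and commands will differ, and the table and column names (`pulse`, `pulse_new`, `id`, `power`) are made up for the example.

```python
import sqlite3


def rebuild_table(conn, table="pulse"):
    """Copy `table` into a fresh table, drop the old one, rename the new one."""
    new = f"{table}_new"
    cur = conn.cursor()
    # 1. Create the new, empty table (hypothetical two-column schema).
    cur.execute(f"CREATE TABLE {new} (id INTEGER PRIMARY KEY, power REAL)")
    # 2. Copy every row from the current table into the new one.
    cur.execute(f"INSERT INTO {new} (id, power) SELECT id, power FROM {table}")
    # 3. Drop the old table, then rename the new one into its place.
    cur.execute(f"DROP TABLE {table}")
    cur.execute(f"ALTER TABLE {new} RENAME TO {table}")
    conn.commit()


# Tiny in-memory demonstration with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pulse (id INTEGER PRIMARY KEY, power REAL)")
conn.executemany(
    "INSERT INTO pulse (id, power) VALUES (?, ?)",
    [(1, 0.5), (2, 1.5), (3, 2.5)],
)
rebuild_table(conn)
rows = conn.execute("SELECT id, power FROM pulse ORDER BY id").fetchall()
```

The copy is the slow part on a large table, and the table can't safely be written to while it runs, which fits with splitting/assimilating having to stay off for the data in question until the swap is done.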

So if we time things right we may be fully functional before the end of Thursday, maybe Friday. However, considering the lost time this morning and the usual unexpected hurdles that crop up, I'm giving it a week, if only to keep expectations realistic yet leave room for potential pleasant surprises.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 878931
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 878933 - Posted: 24 Mar 2009, 21:58:18 UTC - in response to Message 878914.  

See Matt's post HERE

Bernie

Bummer, dude. Looks like the well is dry at the moment, but then we're going through a drought here. Rain will come one day soon; until then I'll wait.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 878933
Ingleside
Volunteer developer
Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 878958 - Posted: 24 Mar 2009, 23:01:55 UTC - in response to Message 878948.  

Would reissues of AP 5.00 be affected?

Reissues for wu-errors or missed deadlines will continue as normal, so there will be a small trickle of work available.

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
ID: 878958
SATAN
Joined: 27 Aug 06
Posts: 835
Credit: 2,129,006
RAC: 0
United Kingdom
Message 878967 - Posted: 24 Mar 2009, 23:27:31 UTC

So if it takes a week to get Thumper and everything else back up to speed, it'll be two weeks before there is a steady flow of work units once again. Network traffic will be hell the moment it's brought back online.

No biggie, just have to find something else to crunch.
ID: 878967
Borgholio
Joined: 2 Aug 99
Posts: 654
Credit: 18,623,738
RAC: 45
United States
Message 878997 - Posted: 25 Mar 2009, 1:49:24 UTC

Well, even though it's a dev version, I installed 6.6.17 on both of my CUDA machines and it's working well. Unlike 6.4.5, it is capable of running GPU Grid and SETI@home at the same time without scheduling issues. So if your GPU goes cold during this outage, subscribe to GPU Grid as a backup project.
You will be assimilated...bunghole!

ID: 878997
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 878999 - Posted: 25 Mar 2009, 1:52:40 UTC - in response to Message 878997.  

Well, even though it's a dev version, I installed 6.6.17 on both of my CUDA machines and it's working well. Unlike 6.4.5, it is capable of running GPU Grid and SETI@home at the same time without scheduling issues. So if your GPU goes cold during this outage, subscribe to GPU Grid as a backup project.

I wish I could right now, but it will be several months before I can upgrade to a GPU that can crunch, as I don't think the old 7800GTX will do the job.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 878999
Voyager
Volunteer tester
Joined: 2 Nov 99
Posts: 602
Credit: 3,264,813
RAC: 0
United States
Message 879033 - Posted: 25 Mar 2009, 5:59:42 UTC

Set to NNT (No New Tasks) and will use the time to do all that's necessary to run 6.6.17, then maybe GPU Grid.
ID: 879033
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 879034 - Posted: 25 Mar 2009, 6:10:10 UTC - in response to Message 879033.  

Set to NNT (No New Tasks) and will use the time to do all that's necessary to run 6.6.17, then maybe GPU Grid.

I set BOINC to NNT earlier as well; GPU Grid I can't do.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 879034
Chelski
Joined: 3 Jan 00
Posts: 121
Credit: 8,979,050
RAC: 0
Malaysia
Message 879044 - Posted: 25 Mar 2009, 8:25:49 UTC

Well, if it is going to be that long a drought, time to schedule some periodic maintenance on the CUDA cruncher. At least the silent majority, who seem to be mostly on AP, will not notice it.


ID: 879044
KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 944
Credit: 52,956,491
RAC: 67
United Kingdom
Message 879056 - Posted: 25 Mar 2009, 11:12:13 UTC

Interestingly, very occasional tasks do seem to be leaking out. One arrived here at 8.43 UTC this morning.

ID: 879056
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 879061 - Posted: 25 Mar 2009, 12:09:36 UTC

I have just had 3 MBs, all resends; with all 3 it seems that one of the wingmen doing them has detached. I do not know if it is the same person.
ID: 879061
Hans Kramer
Volunteer tester
Joined: 16 May 99
Posts: 61
Credit: 8,770,184
RAC: 0
Netherlands
Message 879064 - Posted: 25 Mar 2009, 12:12:00 UTC - in response to Message 879056.  

This is not a newly split WU but one that's been resent to you. This will occasionally happen.

ID: 879064
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 879155 - Posted: 25 Mar 2009, 19:01:31 UTC

I guess there's one good thing out of this. It turns out that if there are no AP_v5s to send, computers that have the right settings in the venue will actually get MBs after all, but the AP_v5 bucket has to be empty. :p
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 879155
Gonad the Destroyer®©™
Joined: 6 Aug 99
Posts: 204
Credit: 12,463,705
RAC: 0
United States
Message 879216 - Posted: 25 Mar 2009, 21:50:49 UTC

Nothing wrong with AP's.....

Sure it takes a tad longer, but the credits more than make up for it.....
ID: 879216
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 879232 - Posted: 25 Mar 2009, 22:45:53 UTC - in response to Message 879216.  

Nothing wrong with AP's.....

Sure it takes a tad longer, but the credits more than make up for it.....

No, there isn't anything wrong with APs, I was just mentioning that from what I have seen, if AP_v5 is selected, that's all you get. Then I noticed that if the AP bucket is empty, it is in fact possible to get MB, but the AP bucket has to be empty.
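
To spell out what I mean, here is a toy model of the behaviour I've seen. It is only a sketch of my observation, not the BOINC scheduler's actual code, and the function and parameter names (`pick_work`, `prefers_ap`, `allows_mb`) are invented for the example.

```python
def pick_work(prefers_ap, allows_mb, ap_queue, mb_queue):
    """Return the next task for a host, or None if nothing can be sent.

    Models the observed behaviour: a host with AP selected gets AP for as
    long as any AP work exists, and only falls through to MB once the AP
    queue is empty.
    """
    if prefers_ap and ap_queue:
        return ap_queue.pop(0)
    if allows_mb and mb_queue:
        return mb_queue.pop(0)
    return None


# One AP task left in the bucket, two MB tasks waiting (made-up names).
ap = ["ap_1"]
mb = ["mb_1", "mb_2"]
first = pick_work(True, True, ap, mb)   # AP is served while the bucket has work
second = pick_work(True, True, ap, mb)  # AP bucket now empty, so MB is sent
```

So a host that allows both only sees MB once the AP bucket has run dry.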
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 879232
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 879347 - Posted: 26 Mar 2009, 8:01:48 UTC

After a quick glance at the cricket graph, it looks as if around 7pm PDT (utc-7) work of some kind started flowing. Looking at the server status page, it looks as if the crew did get things working well enough to get AP rolling again.

Good job guys!
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 879347
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.