Extended Outage August 3 2010 Problems

Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 1022390 - Posted: 4 Aug 2010, 3:49:48 UTC

Here is the next version

Regards

Please consider a Donation to the Seti Project.

ID: 1022390
ex_brit
Joined: 14 Feb 04
Posts: 182
Credit: 431,839
RAC: 0
Canada
Message 1022482 - Posted: 4 Aug 2010, 13:45:36 UTC

Pappa, is this extended outage a permanent feature now? It's all very well, but meanwhile people's machines are storing up thousands of work units ready to be sent, and BOINC keeps trying, fruitlessly of course, every so often.
Peter.
Toronto, Canada

ID: 1022482
evilspoons
Joined: 30 Jul 99
Posts: 50
Credit: 8,469,307
RAC: 4
Canada
Message 1022536 - Posted: 4 Aug 2010, 17:20:40 UTC

I'm assuming this is why my PC has a zillion work units that say "uploading" and aren't going anywhere, and why I've run out of Seti WUs? Looks like I'll be entirely Rosetta for the time being...
ID: 1022536
ex_brit
Joined: 14 Feb 04
Posts: 182
Credit: 431,839
RAC: 0
Canada
Message 1022539 - Posted: 4 Aug 2010, 17:29:47 UTC - in response to Message 1022536.  

Exactly. Why the whole shebang has to be taken down for several days every week is beyond me, but I suppose they must have a good reason.
Peter.
Toronto, Canada

ID: 1022539
Blurf
Volunteer tester
Joined: 2 Sep 06
Posts: 8962
Credit: 12,678,685
RAC: 0
United States
Message 1022542 - Posted: 4 Aug 2010, 17:55:13 UTC

Ex_brit,

I would recommend reading this thread for info on the outages and why they are occurring.


ID: 1022542
ex_brit
Joined: 14 Feb 04
Posts: 182
Credit: 431,839
RAC: 0
Canada
Message 1022543 - Posted: 4 Aug 2010, 18:09:46 UTC - in response to Message 1022542.  

Thanks, I just started reading it, actually. Wonder what Pappa means by 'Here is the next version'.
Peter.
Toronto, Canada

ID: 1022543
Bill Walker
Joined: 4 Sep 99
Posts: 3868
Credit: 2,697,267
RAC: 0
Canada
Message 1022545 - Posted: 4 Aug 2010, 18:11:49 UTC - in response to Message 1022543.  

Wonder what Pappa means by 'Here is the next version'.


He starts one of these threads after each 3-day outage. The theory is that this will help us tell if things got better, or worse, or just different.

ID: 1022545
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1022547 - Posted: 4 Aug 2010, 18:13:56 UTC - in response to Message 1022545.  

Definitely different!! :-)


PROUD MEMBER OF Team Starfire World BOINC
ID: 1022547
ex_brit
Joined: 14 Feb 04
Posts: 182
Credit: 431,839
RAC: 0
Canada
Message 1022549 - Posted: 4 Aug 2010, 18:23:23 UTC - in response to Message 1022547.  

I see....I think...LOL
Peter.
Toronto, Canada

ID: 1022549
Ian Green
Joined: 25 Jul 10
Posts: 24
Credit: 102,337
RAC: 0
Canada
Message 1022566 - Posted: 4 Aug 2010, 20:12:10 UTC - in response to Message 1022549.  

When they want the WU, my machine will simply wait. I run BOINC after hours, so it's no concern when I am working on some project or another.

Last time I looked there was nothing to do and all of the tasks were completed. This is likely due to my GPU being relatively fast.

I speculate that the servers are having problems and until the team can fix them I guess the completed WU file will sit and wait.
ID: 1022566
Bill Walker
Joined: 4 Sep 99
Posts: 3868
Credit: 2,697,267
RAC: 0
Canada
Message 1022599 - Posted: 4 Aug 2010, 23:37:17 UTC - in response to Message 1022566.  

I speculate that the servers are having problems and until the team can fix them I guess the completed WU file will sit and wait.


Check the notice on the S@H home page. The upload and download servers are off every week from Tuesday AM to Friday AM. You're right, BOINC will upload your work when it can, all by itself.

ID: 1022599
Ian Green
Joined: 25 Jul 10
Posts: 24
Credit: 102,337
RAC: 0
Canada
Message 1022606 - Posted: 5 Aug 2010, 0:36:49 UTC

There seem to be a lot of server issues; I wonder if everything is working OK.
ID: 1022606
Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 1022618 - Posted: 5 Aug 2010, 2:03:31 UTC
Last modified: 5 Aug 2010, 2:04:16 UTC

To answer some of the questions: the extended outages are to upgrade servers and to finish getting NTPCKR running full time.

The normal weekly outage was used to recover from whatever server issue had come up, and there was no time to find the root cause. Many servers ran different OSes, which could be a contributing factor. Matt has noted that he plans to shift a new server into place and then start balancing the resources (which means smoother operations).

The other part is that Seti is receiving new code that accounts for a different Duration Correction Factor (DCF) for each resource type (CPU/GPU). The short story is that as we come out the other end of the tunnel, things will be more accurate.

Many things are happening, some that I am aware of and others that I am not. I do know that Jeff and Matt have attempted to keep us abreast of some changes, but we do not know the full impact. The full impact can only be known after the extended outage. Each outage is different.

These threads are an effort to set up one place where any of the staff can look without crashing the BOINC DB, which runs these forums (it is also part of uploads/downloads and scheduler requests). The other part is for someone trying to report a problem. The NC forum is the place to report that, not a post in Tech News. Here you can get help without taking staff time to look at something easily solved. To be honest, they (Seti) do not have the time or the money to tell you to reboot your computer. Users do that, taking time and care.

So there is a lot happening, and there are a few who report back based on what you say.

Regards

Pappa
Please consider a Donation to the Seti Project.

ID: 1022618
VK1PE
Joined: 19 Apr 01
Posts: 9
Credit: 1,676,526
RAC: 0
Australia
Message 1022638 - Posted: 5 Aug 2010, 4:27:41 UTC - in response to Message 1022618.  

I can live with the project needing to do upgrades. That's fine by me!

I have my own outage avoider: years ago I set my "Additional work buffer" to 10 days (the maximum), so I always have WUs.

Peter
ID: 1022638
ex_brit
Joined: 14 Feb 04
Posts: 182
Credit: 431,839
RAC: 0
Canada
Message 1022657 - Posted: 5 Aug 2010, 10:02:34 UTC - in response to Message 1022638.  

Thanks Pappa.
Peter.
Toronto, Canada

ID: 1022657
Ray_GTI-R
Joined: 17 May 99
Posts: 56
Credit: 276,906
RAC: 0
United Kingdom
Message 1022874 - Posted: 6 Aug 2010, 1:22:03 UTC

I'm a recently returned, long-time, pre-BOINC SAH cruncher.
News is helpful.
Server upgrades are inevitable.
Unravelling issues can take forever.
Maybe a new approach is needed?

Meanwhile ...
I recently added 2 more (home) PCs so I knew, as a recent "new" starter, about the 3-day nothing-happens situation.
My initial return to SAH via BOINC was not helpful.
Picture this:
At first installation ... nothing's happening, just lots of entries in the BOINC messages about being unable to connect to the server. Check/re-install/repair. No joy. Then a quick check of other forum messages and a PM to a very helpful cruncher relaxed me. The problem wasn't mine.

How many people join on a "Tuesday morning" (evening where I live), try, try again, then again maybe days later, and then just give up?

One simple Message via BOINC would keep people happy ... maybe "Planned project outage started (date/time). Planned resume expected (date/time)" i.e., not your fault, don't spend hours/days trying to sort out a solution for a problem that doesn't exist or bother other crunchers who may/may not reply.

Just trying to be helpful here!
The difference between 0 and 1 is greater than the difference between 1 and 1,000,000
ID: 1022874
AllenIN
Volunteer tester
Joined: 5 Dec 00
Posts: 292
Credit: 58,297,005
RAC: 311
United States
Message 1022888 - Posted: 6 Aug 2010, 2:30:19 UTC - in response to Message 1022618.  

Pappa,

Any idea why they keep everything down overnight on Thursday? It seems that once they stop doing whatever they are doing on Thursday, they might just as well fire it back up for the evening.

It seems that my dual-processor machine never has enough WUs to make it through the shutdown, even though I'm already set for 10 days.

Allen
ID: 1022888
B-Man
Volunteer tester
Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 1022893 - Posted: 6 Aug 2010, 2:47:16 UTC - in response to Message 1022874.  

Judging by messages posted on the BOINC dev forums, BOINC project messages are being considered. I think it would be a good idea, but it is still under discussion and there are open questions about how to implement it.
ID: 1022893
Ian Green
Joined: 25 Jul 10
Posts: 24
Credit: 102,337
RAC: 0
Canada
Message 1022901 - Posted: 6 Aug 2010, 3:30:36 UTC - in response to Message 1022893.  

I am a C++ developer and I have experience with Windows and Linux if you guys are stuck.

I use Ubuntu for my own server appliance, as it's got a huge community of users that makes their forum very helpful.

I am planning to use a few servers soon for backup purposes as USB disks are limited in some respects.
ID: 1022901
Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 1022917 - Posted: 6 Aug 2010, 5:27:51 UTC - in response to Message 1022874.  

Ray, Welcome Home!

So many things have changed since Seti Classic.

So, as will be mentioned in another post, many things are being worked on at both the server and the BOINC client level.

I will leave that there for now.

Regards

Please consider a Donation to the Seti Project.

ID: 1022917