Long outage...
Author | Message |
---|---|
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
The outage ran long today because we needed to run down to the data center to swap some bad drives for new ones and reboot a few of the machines to pick up kernel and MySQL updates. Sorry for the delay. @SETIEric@qoto.org (Mastodon) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Thanks for the update Eric. I was about to pack it in for the night. You pulled a long shift. Appreciated. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
Thanks for the update. It's nice to know what's going on when things aren't working. Grant Darwin NT |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
Thanks for the update. Maybe things will run smoother now ... |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
Many of the Server Statuses are yet to update, and there are lots of Scheduler errors when reporting work. It's going to take a while to recover from this outage... Grant Darwin NT |
Stargate (SA) Joined: 4 Mar 10 Posts: 1854 Credit: 2,258,721 RAC: 0 |
Since we have not made contact yet I won't be going anywhere soon, but thanks for the update |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
And now that we're starting to pick up work, downloads are stalling. Tried both 208.68.240.127 and 208.68.240.119 in my hosts file. No joy. EDIT- tried 208.68.240.119 again and stalled downloads cleared. Grant Darwin NT |
Stargate (SA) Joined: 4 Mar 10 Posts: 1854 Credit: 2,258,721 RAC: 0 |
Think they loaded work b4 clocking off? |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
"Think they loaded work b4 clocking off?" There were already around 60 files there to be split, and it usually takes about 2 hours after an outage for the splitters to get going again. Grant Darwin NT |
marsinph Joined: 7 Apr 01 Posts: 172 Credit: 23,823,824 RAC: 0 |
Hello Grant, what is the name of 208.68.240.127? I know 119: boinc2ssl.berkeley.edu (download server), but not 127! So what is the name? (I will add it to my hosts file.) "And now we're starting to pick up work, downloads are stalling." |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
"Hello Grant what is the name of 208.68.240.127 ?" Same name. There are 2 servers that handle downloads; the load is meant to be shared between them, but if one has issues then you can't download from it and have to wait till the other server gets used. Generally I keep the hosts file entry commented out, and just use whichever address is working when download issues occur. In the past when things went wrong it's generally been 208.68.240.127 that has had the problems. Grant Darwin NT |
rob smith Joined: 7 Mar 03 Posts: 22526 Credit: 416,307,556 RAC: 380 |
Running down the hill isn't too bad, but running back up again :-( Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
NorthCup Joined: 6 Jun 99 Posts: 108 Credit: 50,093,984 RAC: 5 |
The lighthouse project of distributed computing - Seti - limp as a sick horse - stutters and paralyzes. Tow from one potion to the next and faints regularly. What can we do for you? You already have money, hardware and my full readiness from me - from us. What else do you need for Seti to become a racehorse? greetings Klaus |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
"The lighthouse project of distributed computing - Seti - limp as a sick horse - stutters and paralyzes. Tow from one potion to the next and faints regularly. What can we do for you? You already have money, hardware and my full readiness from me - from us. What else do you need for Seti to become a racehorse? greetings Klaus" Enough staff. |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
With all due respect: * I wish we could know more about the Seti organization, but in general the first response organizations have to "production" problems is to "hire more staff". But truthfully, in most contexts, setting wise management policies and priorities is more effective than adding more people. Please. Being a university project is no excuse for poor execution, if in fact that is what is happening. * Looking at other DC projects recently, it appears that Seti has the most inhomogeneous, bespoke set of hardware in BOINC, and the software manageability may be worse. Perhaps it is time to migrate to a more "planned" architecture. Temporarily scale back the project if necessary to enable growth in the future. I'm sure that would be a lot of work. But then work on the web pages used to market the project, because as it stands now some of those pages are so old or out of date that they give Seti a black eye. * I really don't understand the GBT data transfer problem and its history. But how is it that we are depending on a resource (network bandwidth) that is well known to be too meager to meet the needs or demands? We should remember that Band-Aids eventually fall off, frequently before the healing is complete. * As for today, without any tasks available, why don't they just shut down the project a day early and catch up on some tasks not getting done? |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Work is available today from both observatories. The GBT-Berkeley data connection is on a network that is faster than the commercial Internet. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13854 Credit: 208,696,464 RAC: 304 |
"With all due respect:" Not when you don't have the staff to actually carry out those policies & priorities. 1 person can't do the work of 5 or more when 5 or more are necessary to do what's needed to meet those policies & priorities. "Perhaps, it is time to migrate to a more 'planned' architecture." I agree, just replacing the current HDD storage with AFAs (All Flash Arrays) would cure a lot of problems IMHO. But where are we going to get the $500,000 from? "But how is it that we are depending on a resource (network bandwidth) that is well known to be too meager to meet the needs or demands?" A 100 Gb/s backbone really can't be considered meager. And getting things by network is more reliable than relying on people to load & unload & mail HDDs, with even more points of failure than using a network connection. Grant Darwin NT |
NorthCup Joined: 6 Jun 99 Posts: 108 Credit: 50,093,984 RAC: 5 |
"The lighthouse project of distributed computing - Seti - limp as a sick horse - stutters and paralyzes. Tow from one potion to the next and faints regularly. What can we do for you? You already have money, hardware and my full readiness from me - from us. What else do you need for Seti to become a racehorse? greetings Klaus" "Enough staff." I understand. In Germany volunteer helpers are available for such projects. Would that be a solution for your problem? Have you ever tried to recruit staff from the circle of Seti enthusiasts? Greetings, Klaus |
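
The hosts-file workaround Grant describes in the thread (pinning the download server name to whichever of the two addresses is responding, and otherwise leaving the entry commented out) can be sketched roughly as below. The host name and both addresses are the ones quoted in the thread and may no longer be current; the helper name `hosts_lines` is hypothetical and only builds the text, it does not edit any file. On most Linux systems the file is `/etc/hosts`, and on Windows it is `C:\Windows\System32\drivers\etc\hosts`; editing either normally needs administrator rights.

```python
# Illustrative sketch only: the name and addresses come from the thread
# and are assumptions about the servers at that time.
DOWNLOAD_HOST = "boinc2ssl.berkeley.edu"
ADDRESSES = ("208.68.240.119", "208.68.240.127")


def hosts_lines(active=None):
    """Return hosts-file lines with only `active` left uncommented.

    With active=None every line is commented out, so normal DNS
    resolution applies and the load is shared as intended.
    """
    if active is not None and active not in ADDRESSES:
        raise ValueError(f"unknown address: {active}")
    lines = []
    for addr in ADDRESSES:
        # A leading '#' turns a hosts-file line into a comment.
        prefix = "" if addr == active else "# "
        lines.append(f"{prefix}{addr} {DOWNLOAD_HOST}")
    return lines


if __name__ == "__main__":
    # Pin downloads to the address that cleared the stalled transfers.
    print("\n".join(hosts_lines("208.68.240.119")))
```

Calling `hosts_lines()` with no argument gives the everyday state Grant mentions, with both entries commented out until download problems appear.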
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.