Panic Mode On (13) Server problems

Daniel Ahlborn
Volunteer tester

Joined: 13 Sep 04
Posts: 8
Credit: 1,510,541
RAC: 0
Germany
Message 868739 - Posted: 23 Feb 2009, 20:03:11 UTC - in response to Message 868725.  
Last modified: 23 Feb 2009, 20:07:43 UTC

That's no excuse.

Besides, I have heard that "SETI does not promise..." blah blah blah often enough.


That confirms the "corporate thinking" issue that another member posted about previously.

"If there is a problem, find someone else you can blame it on, but don't fix it at your own expense -----> no problem!"

This ignorance is what really makes me angry in this matter.

I repeat: they want something from me, so I decided to give it to them, asking nothing in return but a reliable infrastructure, for their own sake.


Those failures seem to me as if they were done on purpose.

Frequent network failures may lead people to believe that they need more funds to build a more robust network there.

What a nice marketing strategy; I am thinking of recommending it to other non-profit organisations too.
Just screw up the jobs to be done, and then come whining to the public that you need more funds.

Some brainless person may fall for it. This strategy is cheap, and not even new: Nigerian advance-fee fraudsters have used it successfully for decades.
ID: 868739 · Report as offensive
Westsail and *Pyxey*
Volunteer tester
Joined: 26 Jul 99
Posts: 338
Credit: 20,544,999
RAC: 0
United States
Message 868740 - Posted: 23 Feb 2009, 20:03:31 UTC - in response to Message 868734.  

So the attitude from 'the important people' seems to be 'If You Don't Like It, **** ***'. No problem, I'm good with that.

Have to second this. Like shoots eh, we do what we can. No worries. At least everyone is equal, no one can d/l atm.
"The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov
ID: 868740 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 868741 - Posted: 23 Feb 2009, 20:04:01 UTC - in response to Message 868723.  
Last modified: 23 Feb 2009, 20:06:11 UTC

I used to 'babysit' SETI, but over the past year or more I've gone to plan B, which is to de-emphasize SETI in my collection of projects. Out of seven active projects it is currently fourth, ahead of Grid, Rosetta and Einstein. When we encounter more extended SETI outages, or for the regular 8 to 12 hour Tuesday cycle, I do a couple of things -- I tend to go to 'no new work' for SETI (like with the current problem) or, on Tuesday, I clear reports in the morning and then suspend SETI until Tuesday night -- so that my workstations don't try to communicate with the servers until they have had a chance to recover.

The thing is, the SETI project HAS gotten more fragile of late with the 8 to 12 hour maintenance cycle (4 to 6 hours offline plus an equal amount of time for recovery), along with the weekend upload bafflement.

As I noted elsewhere, SETI is an established project and by far the largest of the BOINC projects -- because of that, when things go bump, they escalate rather quickly. The project simply lacks the bandwidth or processing headroom to handle any 'oopsies' gracefully or to recover quickly.




I am really getting sick of babysitting seti on my fleet all the time.

Why do you have to baby sit your fleet?
I've been away for the last 4 and a half weeks. Looking at my RAC over that period there have been dips & spikes as problems have occurred, and then been resolved. Just as there will be after this has come & gone. It's taken no intervention on my part, BOINC downloads the work when it's available & returns the results when able.

ID: 868741 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13987
Credit: 208,696,464
RAC: 304
Australia
Message 868742 - Posted: 23 Feb 2009, 20:04:01 UTC - in response to Message 868734.  

So the attitude from 'the important people' seems to be 'If You Don't Like It, **** ***'.

Haven't seen that posted in this thread.
People have explained why things are the way they are. They have pointed out what would be required to have 24/7 uptime, timely notification of when there's a problem & what's being done about it, etc. (i.e. a [i]lot[/i] more money than is presently available).
If you still consider it to be unacceptable then there's not much we or anyone else can do, and it's your choice whether you stay & see things through or go elsewhere.
Grant
Darwin NT
ID: 868742 · Report as offensive
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 868743 - Posted: 23 Feb 2009, 20:05:33 UTC
Last modified: 23 Feb 2009, 20:21:58 UTC

I think it's a new peak record! ;-D

[~ 12.5 k in and out..]

EDIT:

On the weekly graph it's not so dramatic..

[~ 11.5 k in and ~ 10 k out]
ID: 868743 · Report as offensive
David @ TPS
Joined: 30 Sep 04
Posts: 70
Credit: 11,323,275
RAC: 0
United States
Message 868748 - Posted: 23 Feb 2009, 20:10:41 UTC

Seems an easy fix.

Shut down AP & CUDA. (Completely Useless, Danged Atrocious)

MB works well when the others are not stealing the bandwidth.

We had the same lack of communication when Classic was dying.

Go back to the basics and let the crunchers do what we do best..... CRUNCH!

JMHO




ID: 868748 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13987
Credit: 208,696,464
RAC: 304
Australia
Message 868753 - Posted: 23 Feb 2009, 20:16:32 UTC - in response to Message 868739.  

That's no excuse.

Nope, they're reasons.


"If there is a problem, find someone else you can blame it on, but don't fix it at your own expense -----> no problem!"

Sorry, I don't follow.
It costs money to run things, it costs money to fix things. In short, it costs money. When money is limited, then you have to juggle the priorities.
The good thing about BOINC is that it is designed to cope with unreliable services. If it can't upload or download, it will try again later. If you want your machine to be busy with just the one project, then it allows you to have a cache of work to see you through outages.
So there is no need for 24/7 availability of work, or returning of work. The money saved by not having 24/7 uptime can then be used for other things, such as keeping the project going.
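
To illustrate the "try again later" idea, here is a minimal sketch of a retry loop with exponential backoff and random jitter. It is only an illustration under stated assumptions, not BOINC's actual scheduling code: the function names, the 60 second base delay and the 4 hour cap are all hypothetical.

```python
import random


def backoff_delays(retries, base_s=60.0, cap_s=4 * 3600.0, rng=None):
    """Return one wait time in seconds for each retry (hypothetical values)."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(retries):
        # The ceiling doubles with every failed attempt, up to a fixed cap.
        ceiling = min(cap_s, base_s * 2 ** attempt)
        # Jitter: pick a random point below the ceiling so that many
        # clients do not all hit the server again at the same moment.
        delays.append(rng.uniform(0.0, ceiling))
    return delays


def retry(transfer, attempts=8, sleep=lambda seconds: None, **backoff_args):
    """Call transfer() until it returns True, at most `attempts` times.

    `sleep` is a no-op by default so the sketch runs instantly;
    pass time.sleep to wait for real.
    """
    for delay in [0.0] + backoff_delays(attempts - 1, **backoff_args):
        sleep(delay)
        if transfer():
            return True
    return False
```

The point of the jitter is the one made above: after an outage, clients that all retry on the same schedule would just recreate the traffic jam, so spreading the retries out gives the servers room to recover.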


I repeat: they want something from me, so I decided to give it to them, asking nothing in return but a reliable infrastructure, for their own sake.

If that is your requirement for the use of your system when you're not using it, then I'm afraid this isn't a suitable project for you.


Those failures seem to me as if they were done on purpose.

Frequent network failures may lead people to believe that they need more funds to build a more robust network there.

What a nice marketing strategy; I am thinking of recommending it to other non-profit organisations too.
Just screw up the jobs to be done, and then come whining to the public that you need more funds.
etc, etc.
Of course, I'm sure they love nothing more than to be accused of criminal behaviour.
Grant
Darwin NT
ID: 868753 · Report as offensive
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 868755 - Posted: 23 Feb 2009, 20:17:40 UTC - in response to Message 868743.  
Last modified: 23 Feb 2009, 20:18:29 UTC

No, I'm manually PUSHING some AP WUs in ;)

But I hope this is solved by tomorrow evenin' :)
It DOES get annoying and needs a solution: better network = gigabit connection. But how much will that cost? I've seen a lot of fundraising ideas.
SEEMS NECESSARY, or APPEARS to be!
ID: 868755 · Report as offensive
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 868765 - Posted: 23 Feb 2009, 20:27:53 UTC - in response to Message 868757.  

I just managed to get all my work returned, but I am going to leave it on NNT for the time being.

ID: 868765 · Report as offensive
HAL

Joined: 28 Mar 03
Posts: 704
Credit: 870,617
RAC: 0
United States
Message 868772 - Posted: 23 Feb 2009, 20:39:55 UTC - in response to Message 868717.  

I wish they could fix the system so it would accept the ones that make it to 100% upload and not set the upload time to start over.
I currently have 6 units at 100% of the 9 work units trying to upload.
Timer is counting down to re-upload those 6.
If more than half of mine are at 100%, I wonder how many other users out there are in the same boat.....
If they could fix this problem it would go a long way toward fixing the traffic problem.

I'm no techie, but I think the response time to acknowledge a data transmission is not readily programmable. I get LOTS of 100% stuff that gets re-sent because the server was too busy to acknowledge in the time my system expected it. It's like trying to drain Lake Superior into the Colorado River through a fire hose in one day. To fix the problem you'd have to ask them to reprogram every internet connection - NICE TRICK IF IT COULD BE DONE. The ultimate solution is to consider that the MAIN goal is to FIND ALIENS, and this may take your lifetime AND MINE (pushing 70), and if their system hiccups, at least you never get left in the cold - eventually it all winds up making muffins.
ID: 868772 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 868779 - Posted: 23 Feb 2009, 20:49:29 UTC - in response to Message 868734.  

So the attitude from 'the important people' seems to be 'If You Don't Like It, **** ***'. No problem, I'm good with that.

I've not seen that, or anything like it, from anyone who actually works for the project -- even the relatively "unimportant" people involved directly.

If anything, I've seen some pretty extraordinary effort from people who are basically employed for an 8 hour day, 5 days per week, and are fixing things on Saturday night.

I've also said repeatedly that we seem to expect more from the BOINC servers than the BOINC client itself really needs.

I find that if I set "connect every 'x' days" to 1, and extra days to 4, I don't seem to run out of work.

Sometimes work finishes on Sunday and doesn't report until Tuesday, but I still get credit for it.
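
The arithmetic behind those two settings can be sketched roughly as follows. This assumes the client simply keeps about "connect every" plus "extra" days of work on hand and that the runtime estimates are accurate; it ignores deadlines, and the function names are hypothetical rather than anything in BOINC.

```python
def buffered_days(connect_every_days, extra_days):
    """Approximate days of work kept on hand (simplifying assumption)."""
    return connect_every_days + extra_days


def rides_out(outage_days, connect_every_days, extra_days):
    """True if a full buffer would outlast an outage of the given length."""
    return buffered_days(connect_every_days, extra_days) >= outage_days


# Settings from the post: connect every 1 day, 4 extra days.
print(buffered_days(1, 4))   # about 5 days of work
print(rides_out(2, 1, 4))    # a Sunday-to-Tuesday gap is covered
```

On these assumptions, a buffer of about five days covers a weekend outage plus the Tuesday maintenance window, provided the cache was full when the outage began.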
ID: 868779 · Report as offensive
Aurora Borealis
Volunteer tester
Joined: 14 Jan 01
Posts: 3075
Credit: 5,631,463
RAC: 0
Canada
Message 868799 - Posted: 23 Feb 2009, 21:29:26 UTC
Last modified: 23 Feb 2009, 21:33:10 UTC

I've always been fascinated by the resurgence of complainers every time SETI has a hiccup. Usually we go 3 to 4 months between bad weekends. BIG DEAL, s**t happens.

I have 15 projects in my BOINC Manager.

Predictor - The project manager became a total a**hole. People were banned. People quit the project. They moved their servers. The project has only existed as a forum for a year now.

LHC - Does work for a billion dollar multi-governmental project. Has no money for BOINC. Once in a while puts out a random set of test work.

Protein@home - Alpha project - finished phase 2 in June - Phase 3 was to start at end of December. No news yet...

Project Tanpaku - Lost servers 18 Aug. A recent note indicates that they are trying to recover data to publish and hope to get the project back up if their backers come through.

Malariacontrol - Been down since Dec. - I haven't been following it closely enough to know the reason, but it seems to have been planned.

SZTAKI - This one is always fun. A while back they started a new series. Got 40 WUs that the project estimated at a couple of hours. Turned out they were 1000+ hr WUs. Project reset. Changed their WUs and it worked well for several months. A week ago got 30 WUs estimated at a couple of hours. Most are running around 40 hrs, and a lot of them are not validating.

Boincsimap - Usually only supplies work for a week or so at the beginning of the month.

Most of the others, including SETI, usually only have minor hiccups, which I usually don't even notice since they only last a day or so.

This was the reason BOINC was designed to manage multiple projects. It allows for the unpredictable extremes that you will find in underfunded projects while allowing the volunteers to still feel they can contribute in a small way to the advancement of science.

Boinc V7.2.42
Win7 i5 3.33G 4GB, GTX470
ID: 868799 · Report as offensive
Gylling
Joined: 23 Dec 02
Posts: 5
Credit: 4,507,311
RAC: 1
Sweden
Message 868800 - Posted: 23 Feb 2009, 21:30:02 UTC

I have the same communication problems on my Windows PCs,
but the Linux PC is able to go through the traffic jam with only a few retries.
ID: 868800 · Report as offensive
Aurora Borealis
Volunteer tester
Joined: 14 Jan 01
Posts: 3075
Credit: 5,631,463
RAC: 0
Canada
Message 868805 - Posted: 23 Feb 2009, 21:42:01 UTC - in response to Message 868800.  

I have the same communication problems on my Windows PCs,
but the Linux PC is able to go through the traffic jam with only a few retries.

Random luck. Some people are starting to get through. It will take a while for things to normalize. This has been a long outage, and they will probably take the servers offline for backup tomorrow as usual.

Boinc V7.2.42
Win7 i5 3.33G 4GB, GTX470
ID: 868805 · Report as offensive
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 868809 - Posted: 23 Feb 2009, 22:07:02 UTC

ID: 868809 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 868812 - Posted: 23 Feb 2009, 22:29:06 UTC - in response to Message 868228.  

Good point about a targeted cash donation - rather than see cash donations 'trickled' into other hardware when the bandwidth limitation is apparently the first order constraint.



Bandwidth is maxed out at SETI@home; I terminated my network activity in BOINC until much later. From what I've read, increasing the bandwidth would cost $80,000.00 total.

Go here and read and maybe donate to the cause, and if one does donate, make it a specific donation.


ID: 868812 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 868815 - Posted: 23 Feb 2009, 22:35:01 UTC - in response to Message 868799.  
Last modified: 23 Feb 2009, 22:38:03 UTC

And an unmanaged forum at that -- which is sort of fun to watch as a psych experiment.

[quote]
Predictor - The project manager became a total a**hole. People were banned. People quit the project. They moved their servers. Project only exists as a forum for a year now.
[/quote]

As to Malaria -- I think they are shorthanded in terms of software engineering and so have been unable to roll out a new wave of work units. Unfortunate really, as unlike SETI they are focused on scientific work. SETI has become more of a combined social experiment and proof of concept effort regarding shared computing rather than something working on science these days.
ID: 868815 · Report as offensive
elgar

Joined: 21 May 99
Posts: 69
Credit: 2,687,478
RAC: 0
United States
Message 868829 - Posted: 23 Feb 2009, 23:00:30 UTC


Cheats wreak havoc on SETI@home: participants

Andrew Colley

30 October 2002 01:00 PM

Tags: boinc, distributed-computing, andew colley, seti, seti@home, cheat, project
SETI@home administrators are allegedly ignoring claims that the project is being sabotaged by miscreants who are threatening to derail its reputation and that of many valuable Internet-based distributed computing projects.

SETI@home is one of the highest-profile proof-of-concept projects for distributed computing. However its Berkeley-based administrators are, participants claim, ignoring allegations that cheating in the competition to contribute the most computing power to the project is rife....



There's a history of SETI ignoring the ignorant masses. It's a big reason why so many people gave up in disgust during the classic client days. Everything old is new again.

http://www.zdnet.com.au/news/security/soa/Cheats-wreak-havoc-on-SETI-home-participants/0,130061744,120269509,00.htm
ID: 868829 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 868831 - Posted: 23 Feb 2009, 23:01:48 UTC

It seems that some SETI BOINC advocates take umbrage at SETI being categorized as a high maintenance BOINC project -- they may have a point. I'd say it is a 'more maintenance' project compared to low maintenance projects like Climate, Einstein and a number of others. What SETI does have going for it are a number of dedicated support folks. It also has the highest volume of BOINC processing by far and that alone tends to make for somewhat more frequent 'interesting times' as well as longer recovery cycles. I am assuming that BOINC SETI advocates would acknowledge that.

ID: 868831 · Report as offensive
Blurf
Volunteer tester
Joined: 2 Sep 06
Posts: 8964
Credit: 12,678,685
RAC: 0
United States
Message 868832 - Posted: 23 Feb 2009, 23:03:59 UTC

I am going to step in and get involved in this discussion here.

I do *NOT* work for SETI; however, I am strongly involved in the donations, both financial and hardware.

There is no "pay or don't crunch" attitude here--not at all. If someone said that, they are wrong. Also, saying these guys aren't doing their work is a downright insult.

Fact is they are SEVERELY underfunded and understaffed. Right now they can't afford to pay for someone to sit in the office all weekend to babysit the servers.

These guys work their tails off...and we need to see that.

Fact is if we want the project to perform better then we need to help them.


ID: 868832 · Report as offensive


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.