Panic Mode On (13) Server problems

Message boards : Number crunching : Panic Mode On (13) Server problems

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 868834 - Posted: 23 Feb 2009, 23:08:59 UTC - in response to Message 868799.  
Last modified: 23 Feb 2009, 23:10:17 UTC

I've always been fascinated by the resurgence of complainers every time Seti has a hiccup. Usually we go 3 to 4 months between bad weekends. BIG DEAL, s**t happens.

I have 15 projects in my Boinc Manager....

You aren't a member of Orbit?

Turned loose a bunch of workunits in June last year. Took many times longer than estimated, and ate up hundreds of megabytes of disk space (and often didn't validate). Prime (sole?) developer hasn't posted anything of significance for 247 days. Independent "front page" website has only had one (derivative) update since 6 October 2008.
ID: 868834 · Report as offensive
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 868835 - Posted: 23 Feb 2009, 23:12:52 UTC - in response to Message 868834.  

Prime (sole?) developer hasn't posted anything of significance for 247 days.

Did you miss Pasquale's post then? It happened on the 29 Jan 2009, which as far as I know isn't 247 days ago. Not even when you're counting in those 42 hour days. ;-)
ID: 868835 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 868842 - Posted: 23 Feb 2009, 23:28:36 UTC - in response to Message 868835.  
Last modified: 23 Feb 2009, 23:53:04 UTC

Prime (sole?) developer hasn't posted anything of significance for 247 days.

Did you miss Pasquale's post then? It happened on the 29 Jan 2009, which as far as I know isn't 247 days ago. Not even when you're counting in those 42 hour days. ;-)

You mean "The server will be down for power interruptions for the next few days...", "...in the last few months we have been busy analyzing the results...", "We don't anticipate yet a date for the initial public testing.".

In the context of this debate, that counts as hot air and waffle. It doesn't come under my definition of "significance".

[I wish I didn't have to say that. Orbit is one of only seven projects I've ever joined - five projects, if you don't count Beta projects separately from their parents. I want to crunch for it, because I think it's as important for the future development of the human race as SETI. But all they've run so far are simulations.]
ID: 868842 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 868846 - Posted: 23 Feb 2009, 23:41:48 UTC - in response to Message 868831.  

It seems that some SETI BOINC advocates take umbrage at SETI being categorized as a high maintenance BOINC project -- they may have a point. I'd say it is a 'more maintenance' project compared to low maintenance projects like Climate, Einstein and a number of others. What SETI does have going for it are a number of dedicated support folks. It also has the highest volume of BOINC processing by far and that alone tends to make for somewhat more frequent 'interesting times' as well as longer recovery cycles. I am assuming that BOINC SETI advocates would acknowledge that.

SETI is probably running on the biggest collection of "scrounged" hardware anywhere in the known BOINC universe.

That doesn't help when times are tough.

I'd definitely call myself a BOINC SETI advocate. I think there are things that the BOINC client should do that would mitigate the problems somewhat, and Matt's post today in technical news is about one of those spots where BOINC should back down, and doesn't.

If they could slow BOINC down to keep a sustained 90% bandwidth use, things could charge along stunningly well. It's when you try to get to 105% that things get really yucky.
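To put rough numbers on the 90% versus 105% point, here is a minimal sketch of a fixed-capacity link. The function name `simulate_backlog` and the 100-unit capacity are made up for illustration, and the model leaves out retries, which in practice add even more load once the link is saturated.

```python
def simulate_backlog(capacity, offered, steps):
    """Return the queued backlog after each time step.

    capacity: units of traffic the link can serve per step (assumed value).
    offered:  units of new traffic arriving per step.
    """
    backlog = 0
    history = []
    for _ in range(steps):
        demand = backlog + offered      # old leftovers plus new arrivals
        served = min(demand, capacity)  # the link never serves more than it can carry
        backlog = demand - served       # whatever did not fit waits for the next step
        history.append(backlog)
    return history


CAPACITY = 100  # hypothetical link size, arbitrary units

print("90% load: ", simulate_backlog(CAPACITY, 90, 5))
print("105% load:", simulate_backlog(CAPACITY, 105, 5))
```

At 90% the backlog stays at zero on every step. At 105% it grows by five units per step and never drains while the overload lasts, which is why a small overshoot hurts so much more than a small undershoot.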

All of that said, I'm impressed with how well things do work.
ID: 868846 · Report as offensive
Profile BigWaveSurfer

Joined: 29 Nov 01
Posts: 186
Credit: 36,311,381
RAC: 141
United States
Message 868884 - Posted: 24 Feb 2009, 2:04:53 UTC - in response to Message 868625.  

For that, check the progress column in the Transfers tab. :-)


:-) That's where I am:

Project: SETI@home
File: 16ja09ab...
Progress: 100.00%
Size: 28.17/28.17 KB
Elapsed Time: 13:20
Speed: 0.00 KBps
Status: Retry in 03:12:46 (and counting)

2 of my uploads are like this.



I currently have 11 like that! They show as totally transferred but for some reason still want to retry?!
ID: 868884 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 868886 - Posted: 24 Feb 2009, 2:10:02 UTC - in response to Message 868884.  

My last upload finally made the trip home. I'm now crunching away on new WUs.


PROUD MEMBER OF Team Starfire World BOINC
ID: 868886 · Report as offensive
Profile Graeme of Boinc UK

Joined: 25 Nov 02
Posts: 114
Credit: 1,250,273
RAC: 0
United Kingdom
Message 868894 - Posted: 24 Feb 2009, 2:21:32 UTC

For the whingers............

http://setiathome.berkeley.edu/tech_news.php
ID: 868894 · Report as offensive
JRL

Joined: 23 Feb 06
Posts: 31
Credit: 219,472
RAC: 0
United States
Message 868910 - Posted: 24 Feb 2009, 3:14:43 UTC - in response to Message 868772.  

I wish they could fix the system so it would accept the ones that make it to 100% upload and not set the upload time to start over.
I currently have 6 units at 100% of the 9 work units trying to upload.
The timer is counting down to re-upload those 6.
If more than half of mine are at 100%, I wonder how many other users out there are in the same boat.....
If they could fix this problem it would go a long way toward fixing the traffic problem.

I'm no techie, but I think the response time to acknowledge a data transmission is not readily programmable. I get LOTS of 100% stuff that gets re-sent because the server was too busy to acknowledge in the time my system expected it. It's like trying to drain Lake Superior into the Colorado River through a fire hose in one day. To fix the problem you ask them to reprogram every internet connection - NICE TRICK IF IT COULD BE DONE. The ultimate solution is to consider that the MAIN goal is to FIND ALIENS, and this may take your lifetime AND MINE (pushing 70), and if their system hiccups, at least you never get left in the cold - eventually it all winds up making muffins.


The problem could be in the BOINC software. Maybe they could give the server more time to reply before setting the countdown timer for another try....
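As a rough illustration of that idea, the sketch below counts how many finished uploads would be re-sent for a given acknowledgement timeout. The function name, the latency figures, and the timeout values are all invented for the example; they are not BOINC settings or measurements from this outage.

```python
def count_resends(ack_latencies_s, ack_timeout_s):
    """Count uploads whose acknowledgement arrived after the client stopped waiting."""
    return sum(1 for latency in ack_latencies_s if latency > ack_timeout_s)


# Hypothetical seconds between an upload reaching 100% and the server's reply.
latencies = [2, 8, 45, 70, 130, 20, 95, 310, 15]

for timeout in (30, 120, 600):
    resends = count_resends(latencies, timeout)
    print(f"timeout {timeout:>3}s -> {resends} of {len(latencies)} uploads re-sent")
```

With these made-up figures, a longer wait turns most of the re-sends into accepted uploads, at the cost of each connection being held open longer on an already busy server.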


ID: 868910 · Report as offensive
Profile The Uncle B's
Joined: 3 Nov 05
Posts: 17
Credit: 16,820,181
RAC: 0
United States
Message 868942 - Posted: 24 Feb 2009, 5:53:06 UTC

Here's what I get:

2/23/2009 9:48:06 PM|SETI@home|Temporarily failed upload of 14ja09aa.29048.1295.15.8.146_0_0: HTTP error
2/23/2009 9:48:06 PM|SETI@home|Backing off 40 min 51 sec on upload of 14ja09aa.29048.1295.15.8.146_0_0
2/23/2009 9:48:06 PM|SETI@home|Started upload of 16ja09aa.24128.2526.10.8.87_0_0
2/23/2009 9:48:07 PM||Internet access OK - project servers may be temporarily down.
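The "Backing off" line above is a retry delay. A common way to pick such delays is randomized exponential backoff, sketched below. This is a generic illustration only: the function name, the 60-second base, and the 4-hour cap are assumptions, not the BOINC client's actual values or algorithm.

```python
import random


def backoff_delay(failures, base_s=60.0, cap_s=4 * 3600.0, rng=random):
    """Return a randomized retry delay in seconds after `failures` failed attempts.

    The ceiling doubles with every failure until it reaches cap_s; the delay is
    then drawn from the upper half of that range so clients spread out their retries.
    """
    ceiling = min(cap_s, base_s * (2 ** failures))
    return rng.uniform(0.5 * ceiling, ceiling)


for failures in range(8):
    minutes = backoff_delay(failures) / 60
    print(f"after {failures} failures: retry in about {minutes:.1f} min")
```

The random component matters as much as the doubling: without it, every client that failed at the same moment would come back at the same moment too.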


Anyone else having problems uploading Finished WUs?

Thanks!
350+ GHZ of power for the PC Perspective Seti Team We Are The Uncle B's

ID: 868942 · Report as offensive
Profile gizbar
Joined: 7 Jan 01
Posts: 586
Credit: 21,087,774
RAC: 0
United Kingdom
Message 868945 - Posted: 24 Feb 2009, 6:02:50 UTC - in response to Message 868942.  

Hi, just to let you know, everybody is having trouble uploading and downloading. If you look at the message boards, this thread has been repeated multiple times. Matt provides a form of explanation here:-

Matt's Post

regards, Gizbar.



A proud GPU User Server Donor!
ID: 868945 · Report as offensive
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 868946 - Posted: 24 Feb 2009, 6:03:42 UTC

Yep, there has been massive gridlock getting to the project since the beginning of last weekend.

Things were just starting to free up yesterday. Probably will take until later in the week for everything to clear and get back to more or less normal.

Alinator
ID: 868946 · Report as offensive
Profile Phil

Joined: 4 Jul 00
Posts: 4
Credit: 1,345,007
RAC: 1
Australia
Message 868958 - Posted: 24 Feb 2009, 6:59:59 UTC

I am going through the same dilemma: I can't upload completed work. Are the servers down, or is this a generated message? It's been going on for days now.
ID: 868958 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 868959 - Posted: 24 Feb 2009, 7:20:54 UTC - in response to Message 868846.  


But it IS the *biggest* collection of hardware <smile>. A lot of hardware, with a LOT of duct tape.

One thing I'd like to see in a future iteration of the client is for users to have the ability, *at the project level*, to choose 'no network traffic'.



ID: 868959 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 869308 - Posted: 25 Feb 2009, 13:05:35 UTC

Wow! The cricket graph isn't maxed out anymore! I just checked at 13:00 UTC and it was down to 87 Mbit. Still pretty high up there, but not maxed out, which means any connections should go through on the first try. Pretty impressive... the recovery happened a little faster than Matt expected. He was thinking it would begin to settle down toward the end of the day PST, but the sun hasn't even risen there yet.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 869308 · Report as offensive
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 869316 - Posted: 25 Feb 2009, 14:12:28 UTC - in response to Message 869308.  
Last modified: 25 Feb 2009, 14:13:03 UTC

LOL...

Yep, and they just set a new record for the longest period of sustained max bandwidth since the switch to Hurricane! :-)

Of course, it's a much better user experience with things working more or less correctly again. My hosts didn't seem to care too much either way though. ;-)

Alinator
ID: 869316 · Report as offensive
HarryM
Volunteer tester

Joined: 24 Jul 08
Posts: 68
Credit: 3,812,695
RAC: 0
United States
Message 869328 - Posted: 25 Feb 2009, 14:45:23 UTC

Does anyone know why neither AP_validate is running? They both say "Not Running". The explanation is "Program failed or ran out of work". I don't think they ran out of work.

Also do they pertain to both AP 5.00 and 5.03?

ID: 869328 · Report as offensive
Profile Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 869386 - Posted: 25 Feb 2009, 17:50:07 UTC
Last modified: 25 Feb 2009, 17:52:34 UTC

Don't know why but it seems that Berkeley takes down the AP validators at the first sign of problems.

And since I prefer to crunch AP it really hurts.
Boinc....Boinc....Boinc....Boinc....
ID: 869386 · Report as offensive
Profile skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 869387 - Posted: 25 Feb 2009, 17:58:26 UTC

Looks like the credits aren't getting moved along. I see that I have 26000 more credits on my page, yet the stats are still sitting back from a couple of days ago.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 869387 · Report as offensive
Profile Mark Peters
Volunteer tester
Joined: 5 Jul 02
Posts: 80
Credit: 588,422
RAC: 0
Belgium
Message 869406 - Posted: 25 Feb 2009, 19:29:26 UTC

Had about 6-10 WUs to report, waited for 4 complete days, and they reported successfully. Now my reports are going perfectly, and up- and downloads are going well... Good job to the SETI team... bravo!

Mark
To boldly go where no human has gone before!
ID: 869406 · Report as offensive
-Bert-

Joined: 23 Mar 02
Posts: 152
Credit: 412,754
RAC: 0
Netherlands
Message 869413 - Posted: 25 Feb 2009, 19:49:01 UTC - in response to Message 869328.  

Does anyone know why neither AP_validate is running? They both say "Not Running". The explanation is "Program failed or ran out of work". I don't think they ran out of work.

They're both back online now ;-)
ID: 869413 · Report as offensive


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.