Panic Mode On (21) Server problems

zpm
Volunteer tester
Joined: 25 Apr 08
Posts: 284
Credit: 1,659,024
RAC: 0
United States
Message 919031 - Posted: 18 Jul 2009, 15:40:19 UTC - in response to Message 919030.  

beta side is down (webpages still up though).
ID: 919031
# Bob Ahlers #

Joined: 30 Mar 01
Posts: 18
Credit: 10,209,954
RAC: 0
Netherlands
Message 919084 - Posted: 18 Jul 2009, 18:43:01 UTC - in response to Message 919031.  

Can someone please tell me why the *&%&^% I still cannot upload results and/or download new WUs? It's not my machines!

The scheduler is not responding, uploading does not work, and neither does getting new WUs.

This is the second time within one week that 5x dual Core2Quad machines are all idle.

Damn!
ID: 919084
Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 919089 - Posted: 18 Jul 2009, 18:55:49 UTC - in response to Message 919084.  
Last modified: 18 Jul 2009, 19:07:20 UTC

Hi,

As you can see from the Cricket Graph

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;view=Octets;ranges=d%3Aw%3Am%3Ay

the project is recovering from another recent lengthy outage, so it will be a while before work units are flowing again.
ID: 919089
ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21227
Credit: 7,508,002
RAC: 20
United Kingdom
Message 919092 - Posted: 18 Jul 2009, 18:57:44 UTC - in response to Message 919084.  
Last modified: 18 Jul 2009, 19:00:31 UTC

Can someone please tell me why the *&%&^% I still cannot upload results and/or download new WUs? It's not my machines!

The scheduler is not responding, uploading does not work, and neither does getting new WUs.

This is the second time within one week that 5x dual Core2Quad machines are all idle.

Damn!

Try some patience there now?

Attach to another project of interest to soak up some idle CPU slack?

At the moment, the s@h downlink is saturated. That makes it look like it's blocked or that some of the s@h servers are dead/overloaded. The blocked downlink also in effect blocks the uplink, because few or no handshake data packets make it through.

Hence, for s@h, you cannot easily get WUs or return WU results at the moment.

See:
http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=%2Frouter-interfaces%2Finr-250%2Fgigabitethernet2_3;view=Octets;ranges=d

90Mbit/s of green and above means that the downlink is saturated.
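
For anyone who wants to check graph numbers against that rule of thumb, here is a minimal sketch in Python. The 90 Mbit/s threshold is the one quoted above; the sample readings and the function name are made up for illustration and are not taken from the real Cricket graph.

```python
# Threshold quoted in the post: 90 Mbit/s or more counts as saturated.
SATURATION_MBIT = 90.0


def saturated_fraction(samples_mbit, threshold=SATURATION_MBIT):
    """Return the fraction of samples at or above the saturation threshold."""
    if not samples_mbit:
        return 0.0
    hits = sum(1 for sample in samples_mbit if sample >= threshold)
    return hits / len(samples_mbit)


# Hypothetical downlink readings in Mbit/s (assumed values, not real data).
readings = [72.0, 88.5, 91.2, 94.0, 93.7, 60.3]
print(f"{saturated_fraction(readings):.0%} of samples saturated")
```

With those made-up readings, three of the six samples are at or above the threshold.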

Further details on:
(NOT) Working As Expected


Boinc will blunder through the muddle. Hopefully, a few fixes will be programmed in also.

Happy crunchin',
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 919092
TerryG
Joined: 11 Mar 01
Posts: 16
Credit: 15,351,703
RAC: 37
United Kingdom
Message 919095 - Posted: 18 Jul 2009, 19:10:40 UTC - in response to Message 919092.  
Last modified: 18 Jul 2009, 19:11:18 UTC

Not sure it's going to get better soon either. The site was more or less closed this morning (UK time) to allow the replica to catch up (I believe). However, I've noticed that it is slowly slipping further behind.

Best to follow ML1's advice and attach to another project while you wait. It will sort itself out eventually.

That's what you get for being part of such a popular project!
ID: 919095
HTH
Volunteer tester

Joined: 8 Jul 00
Posts: 691
Credit: 909,237
RAC: 0
Finland
Message 919097 - Posted: 18 Jul 2009, 19:20:16 UTC - in response to Message 919084.  

This is the second time within one week that 5x dual Core2Quad machines are all idle.


Rosetta@home
World Community Grid
AQUA@home
QMC@home
ID: 919097
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9958
Credit: 103,452,613
RAC: 328
United Kingdom
Message 919100 - Posted: 18 Jul 2009, 19:29:06 UTC

I had recently shut down all of my crunchers due to a spell in hospital. When I try to restart them, all I get is continued failures to upload or download. I have been with SETI@Home since 1999 and would really love to continue to crunch, but you cannot have several machines sitting around waiting to upload, and you can't switch them off as that will lose the work and upset your wingman.

I will set no new tasks and crunch something that might actually be useful.

Bernie
ID: 919100
Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 919107 - Posted: 18 Jul 2009, 20:20:04 UTC
Last modified: 18 Jul 2009, 20:27:56 UTC


I think it wasn't a good idea for SETI@home to publish a CUDA GPU app.
[in December '08 / January '09]

Since February I have had 'only' 2 GPUs. February, March, April.. the PC had enough work. In May I added 2 more GPUs. And then..

For ~ 2 months or more, only server troubles.

Why did they publish this app if the servers can't support the higher traffic?
Maybe because of the members' added GPUs?

My two extra GPUs are not the reason, are they? ;-)


Oh well.. really.. SETI@home is no fun..
In the past with my QX6700.. RAC stock ~ 4,500.. nice life.. no 'babysitting'..

Now with my GPU cruncher.. RAC maybe ~ 32,000, only trouble..
Yes, it's the current top_host_#2 .. http://setiathome.berkeley.edu/top_hosts.php
All the time I must check: enough work, VLAR backlog work request*, UL/DL OK, and so on..?
But.. I think for a stress-free life I should disable the cruncher.

What good reason do I have to keep it if the GPU cruncher is idle ~ 25 % or more of the running time?


Yes, you can't help me.
Maybe you now think.. hey - why does he 'suffer'..?
Only Berkeley could help me.
I thought it would be a good idea to support my one and only beloved project with the equipment I could afford.. uhh - and for me a lot of money is gone..
But.. after buying/building the GPU cruncher..
I'm really disappointed/frustrated.. and I think SETI@home doesn't want my GPU cruncher.


Why isn't it possible to have a good server connection like in the past.. maybe like half a year ago?


[* no - I don't use the rebranding CPU/GPU tool, it's a pure GPU cruncher for best/max. GPU performance]

ID: 919107
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 919113 - Posted: 18 Jul 2009, 20:48:16 UTC - in response to Message 919107.  

I thought it would be a good idea to support my one and only beloved project with the equipment I could afford.. uhh - and for me a lot of money is gone..

You chose to buy the new video card- no one forced you. If you wanted to support the project you could have just sent them that money instead.


Why isn't it possible to have a good server connection like in the past.. maybe like half a year ago?

CUDA and more active crunchers = greater load on the servers.


I have yet to run out of work through all the issues that have been occurring over the last few months. I've only run out of work a couple of times since Seti moved to BOINC.
And all with a small 4 day cache.
Grant
Darwin NT
ID: 919113
-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 919115 - Posted: 18 Jul 2009, 20:58:44 UTC - in response to Message 919107.  
Last modified: 18 Jul 2009, 20:58:59 UTC


What good reason do I have to keep it if the GPU cruncher is idle ~ 25 % or more of the running time?


Nice.. nowadays mine seems to be occupied at most 50% of its time; it's mostly idle for me :-/

Well these things will go by in time, you only need to wait a bit and have patience..

Regards Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 919115
Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 919124 - Posted: 18 Jul 2009, 21:39:08 UTC

Just an observation: since the recent outage, I'm assuming that bringing the upload server back up reset Eric's change, so it is no longer in effect and uploads are back to 21 seconds.
ID: 919124
Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 919125 - Posted: 18 Jul 2009, 21:45:21 UTC


Some yes, some no.. maybe it has something to do with the errors?

Temporarily failed upload of xxxxxxxxxxxxxxxxxxxxxxxxxxxx: connect() failed

Temporarily failed upload of xxxxxxxxxxxxxxxxxxxxxxxxxxxx: HTTP error


ID: 919125
Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 919128 - Posted: 18 Jul 2009, 21:53:36 UTC - in response to Message 919115.  

Nice.. nowadays mine seems to be occupied at most 50% of its time; it's mostly idle for me :-/

Well these things will go by in time, you only need to wait a bit and have patience..

Regards Vyper


I don't know how you can be so relaxed.. ;-)

Your GPU cruncher makes double the RAC when stable.
And needs twice the WUs/day of my 'small' GPU cruncher.

Yes.. I'm patient.. since I became a SETI@home member I have had this trait.. (for the last two months more than before..) but I don't know how long I can continue..

ID: 919128
Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 919129 - Posted: 18 Jul 2009, 21:59:34 UTC - in response to Message 919113.  
Last modified: 18 Jul 2009, 22:04:24 UTC

...
I have yet to run out of work through all the issues that have been occurring over the last few months. I've only run out of work a couple of times since Seti moved to BOINC.
And all with a small 4 day cache.


I also have only a ~ 4 day WU cache on my GPU cruncher..

..and ran out of work many times.

More isn't possible because of my slow 'DSL light', unplanned server outages at Berkeley, and BOINC, which can't manage a large WU cache.
If I reach a ~ 4 day cache (3,500 WUs), BOINC is like frozen.. I press a button and ~ 30 seconds later maybe something happens.
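
To put those cache numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the figures from the post (3,500 WUs for about 4 days); the function name is made up, and it assumes the host works through the cache at a steady rate, which real hosts will not do exactly.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def cache_rate(work_units, days):
    """Return (WUs per day, seconds per WU) for a cache of a given size.

    Assumes a steady processing rate over the whole period.
    """
    per_day = work_units / days
    seconds_each = SECONDS_PER_DAY / per_day
    return per_day, seconds_each


# Figures from the post: ~3,500 WUs cover ~4 days of work.
per_day, seconds_each = cache_rate(3500, 4)
print(f"{per_day:.0f} WUs/day, one WU every {seconds_each:.1f} s")
```

Under that assumption the host finishes a WU roughly every 99 seconds, so every outage of a few hours adds hundreds of results waiting to upload.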

ID: 919129
john deneer
Volunteer tester
Joined: 16 Nov 06
Posts: 331
Credit: 20,996,606
RAC: 0
Netherlands
Message 919135 - Posted: 18 Jul 2009, 22:11:21 UTC - in response to Message 919107.  

In the past with my QX6700.. RAC stock ~ 4,500.. nice life.. no 'babysitting'..

Now with my GPU cruncher.. RAC maybe ~ 32,000, only trouble..
Yes, it's the current top_host_#2 .. [url]

Hi Sutaru,

That's the price you pay for living in the fast lane, I guess. Don't worry, I'm sure better times will be coming.

I use these outages to improve my machines or do some maintenance. Over the last 3 days I improved the cooling on my i7 by hooking it up to a water cooling setup (which I previously used for a 9450). Temperatures at 4 GHz are much better now; I got them down by approx. 5 degrees Centigrade.

It's frustrating sometimes, but just accept that you can't do anything about it. Everybody else is having to cope with these 'bad times' and everybody is taking a hit on their RAC. Dealing with the bad times is just as much a part of the fun as dealing with the good times when your RAC is on the rise!

Regards,
John.
ID: 919135
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 919136 - Posted: 18 Jul 2009, 22:11:53 UTC - in response to Message 919124.  

Just an observation: since the recent outage, I'm assuming that bringing the upload server back up reset Eric's change, so it is no longer in effect and uploads are back to 21 seconds.


That is my thought as well: they no longer time out instantly and the uploads are flat-lined again.

ID: 919136
Byron S Goodgame
Volunteer tester
Joined: 16 Jan 06
Posts: 1145
Credit: 3,936,993
RAC: 0
United States
Message 919141 - Posted: 18 Jul 2009, 22:23:25 UTC

Well, as of a minute or so ago it seems to be back, so things are starting to get somewhat better.
ID: 919141
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 919144 - Posted: 18 Jul 2009, 22:31:30 UTC - in response to Message 919084.  

Can someone please tell me why the *&%&^% I still cannot upload results and/or download new WUs? It's not my machines!

The scheduler is not responding, uploading does not work, and neither does getting new WUs.

This is the second time within one week that 5x dual Core2Quad machines are all idle.

Damn!

There are dozens of threads on this exact subject. We can restate them for you, or you can track them down yourself.

A combination of factors has contributed to a higher-than-normal load. That leads to too many machines trying to upload all at once. That leads to machines not downloading, so they don't keep adding uploads when things are already backed up.
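
The usual way for clients to avoid all retrying at the same moment is randomized exponential backoff. Below is a minimal, generic sketch of that technique in Python. It is not a copy of what the BOINC client actually does: the function name, the 60 second base, and the 4 hour cap are assumptions chosen for illustration.

```python
import random


def backoff_delay(attempt, base=60.0, cap=4 * 3600.0, rng=random):
    """Return a random delay in seconds before retry number `attempt`.

    `attempt` is 0 for the first retry. The upper bound doubles with each
    failed attempt until it reaches `cap`, and the actual delay is drawn
    uniformly below that bound so that clients do not retry in lockstep.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


# Example: delays for the first five retries, with a seeded generator
# so that repeated runs print the same sequence.
seeded = random.Random(21)
for attempt in range(5):
    print(attempt, round(backoff_delay(attempt, rng=seeded), 1))
```

The random spread matters as much as the doubling: without it, every host that failed at the same time would come back at the same time.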

Solutions:

1) Don't worry about it, do nothing and things will get better.

2) Give your machines a little break.

3) Pick a second project, give it a small share of your resources, and let that project keep you busy while SETI is struggling. When things ease, BOINC will repay any extra time back to SETI.
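
To illustrate option 3: BOINC resource shares are relative numbers, so each project's slice is its share divided by the sum of all shares. Here is a minimal sketch in Python; the backup project name and the 100/10 split are assumed values for the example, not a recommendation, and the sketch ignores how the client balances time over the long run.

```python
def share_fractions(shares):
    """Convert relative resource shares into fractions that sum to 1."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("total resource share must be positive")
    return {name: value / total for name, value in shares.items()}


# Hypothetical setup: SETI@home at 100, a backup project at 10.
shares = {"SETI@home": 100, "backup project": 10}
fractions = share_fractions(shares)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

With that split the backup project nominally gets about 9% of the machine, but it can use the idle time whenever SETI has no work to give.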
ID: 919144
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 919147 - Posted: 18 Jul 2009, 22:42:52 UTC
Last modified: 18 Jul 2009, 22:51:42 UTC

There's just been a 10Mb/s drop in download traffic, and a corresponding 20Mb/s increase in upload traffic.


EDIT.
Ah, it appears AP has stopped splitting. Hence the drop in download traffic.
Grant
Darwin NT
ID: 919147
-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 919149 - Posted: 18 Jul 2009, 22:47:48 UTC - in response to Message 919129.  


I also have only a ~ 4 day WU cache on my GPU cruncher..

..and ran out of work many times.

More isn't possible because of my slow 'DSL light', unplanned server outages at Berkeley, and BOINC, which can't manage a large WU cache.
If I reach a ~ 4 day cache (3,500 WUs), BOINC is like frozen.. I press a button and ~ 30 seconds later maybe something happens.


Heh, I would be lucky if I could have at least a cache of one day :) and that's about 2000 WUs..

"remember the time" With Michael Jackson :)

//Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 919149