Panic Mode On (79) Server Problems?


arkayn
Project donor
Volunteer tester
Joined: 14 May 99
Posts: 3720
Credit: 48,768,260
RAC: 1,737
United States
Message 1307257 - Posted: 18 Nov 2012, 2:59:22 UTC

Timeouts, Timeouts, Timeouts!!!!!!!
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5915
Credit: 61,693,408
RAC: 31,007
Australia
Message 1307260 - Posted: 18 Nov 2012, 3:19:28 UTC


And still more timeouts.
____________
Grant
Darwin NT.

Jim_S
Project donor
Joined: 23 Feb 00
Posts: 4534
Credit: 19,053,795
RAC: 7,525
United States
Message 1307261 - Posted: 18 Nov 2012, 3:22:19 UTC
Last modified: 18 Nov 2012, 3:23:43 UTC

And still Very Boring!
Somebody KICK Something...Please.
____________

I Desire Peace and Justice, Jim Scott (Mod-Ret.)

TBar
Volunteer tester
Joined: 22 May 99
Posts: 1496
Credit: 53,028,983
RAC: 50,100
United States
Message 1307275 - Posted: 18 Nov 2012, 4:14:43 UTC - in response to Message 1307257.
Last modified: 18 Nov 2012, 4:50:02 UTC

Timeouts, Timeouts, Timeouts!!!!!!!

I snagged a couple more Time Outs. I thought the Virus Scanner Crash had something to do with the last ones, but there was no crash this time. I ran down my nVidia cache so I could switch back to CUDA23s on my old card and was trying to refill it. Here are the times:
11/17/2012 10:23:17 PM | | Using proxy info from GUI
11/17/2012 10:23:17 PM | | Using HTTP proxy 173.224.215.173:3128
11/17/2012 10:23:24 PM | SETI@home | update requested by user
11/17/2012 10:23:30 PM | SETI@home | Sending scheduler request: Requested by user.
11/17/2012 10:23:30 PM | SETI@home | Reporting 7 completed tasks
11/17/2012 10:23:30 PM | SETI@home | Requesting new tasks for CPU and NVIDIA and ATI
11/17/2012 10:24:08 PM | SETI@home | Scheduler request completed: got 20 new tasks
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 20au12ad.15117.7020.140733193388035.10.204_2
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 08se12ac.14602.22153.140733193388044.10.79_2
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 08se12ac.14602.22153.140733193388044.10.137_2
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 08se12ac.14602.22153.140733193388044.10.158_2
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 08se12ac.15079.21335.140733193388045.10.209_2
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 09se12ab.18967.315158.140733193388048.10.17_3
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 09se12ab.18967.315158.140733193388048.10.20_3
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 09se12ab.18967.315158.140733193388048.10.23_3
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 10se12aa.21403.16018.140733193388040.10.29_3
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 09se12ab.18967.315158.140733193388048.10.32_3
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 26fe12ab.13174.480.140733193388045.10.226_4
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 30se12ad.8531.113216.140733193388035.10.209_3
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 30se12ad.8531.116488.140733193388035.10.29_3
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 29se12ab.27923.13155.140733193388044.10.10_3
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 30se12ad.696.120987.140733193388036.10.153_2
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 30se12aa.13934.25016.6.10.174_2
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 10se12aa.21403.16018.140733193388040.10.25_3
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 09se12ab.18967.315158.140733193388048.10.70_2
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 10se12aa.21403.16018.140733193388040.10.28_3
11/17/2012 10:24:08 PM | SETI@home | Resent lost task 09se12ab.18967.315158.140733193388048.10.73_2
11/17/2012 10:24:10 PM | SETI@home | Started download of 20au12ad.15117.7020.140733193388035.10.204
11/17/2012 10:24:10 PM | SETI@home | Started download of 08se12ac.14602.22153.140733193388044.10.79
11/17/2012 10:24:23 PM | SETI@home | Finished download of 20au12ad.15117.7020.140733193388035.10.204
11/17/2012 10:24:23 PM | SETI@home | Started download of 08se12ac.14602.22153.140733193388044.10.137
11/17/2012 10:24:24 PM | SETI@home | Finished download of 08se12ac.14602.22153.140733193388044.10.79
11/17/2012 10:24:24 PM | SETI@home | Started download of 08se12ac.14602.22153.140733193388044.10.158
11/17/2012 10:24:30 PM | SETI@home | Finished download of 08se12ac.14602.22153.140733193388044.10.137
11/17/2012 10:24:30 PM | SETI@home | Started download of 08se12ac.15079.21335.140733193388045.10.209
11/17/2012 10:24:38 PM | SETI@home | Finished download of 08se12ac.15079.21335.140733193388045.10.209
11/17/2012 10:24:38 PM | SETI@home | Started download of 09se12ab.18967.315158.140733193388048.10.17
11/17/2012 10:24:57 PM | | Suspending network activity - user request
11/17/2012 10:25:06 PM | | Using proxy info from GUI
11/17/2012 10:25:06 PM | | Not using a proxy
11/17/2012 10:25:11 PM | | Resuming network activity
11/17/2012 10:25:11 PM | SETI@home | Started download of 08se12ac.14602.22153.140733193388044.10.158...

Then I got:
2718613438 1088129501 18 Nov 2012 | 2:20:30 UTC 18 Nov 2012 | 2:30:53 UTC Timed out - no response
2718613436 1088129393 18 Nov 2012 | 2:20:30 UTC 18 Nov 2012 | 2:30:53 UTC Timed out - no response
2718613434 1088129387 18 Nov 2012 | 2:20:30 UTC 18 Nov 2012 | 2:30:53 UTC Timed out - no response...
http://setiathome.berkeley.edu/results.php?hostid=6797524&offset=0&show_names=0&state=6&appid=

I never got these without the proxy. At least I'm back to my limit...

Lee Gresham
Joined: 12 Aug 03
Posts: 131
Credit: 102,080,179
RAC: 2,956
United States
Message 1307295 - Posted: 18 Nov 2012, 6:51:37 UTC

And just to make things even better how about mini work units when you do get some work. I bet that helps the scheduler overload!
____________
Delta-V

Donald L. Johnson
Project donor
Joined: 5 Aug 02
Posts: 6324
Credit: 768,569
RAC: 981
United States
Message 1307299 - Posted: 18 Nov 2012, 7:49:46 UTC

In the previous iteration of this thread, someone was talking about how Macs and other "weak sister" systems seem to be faring much better than Windows systems. Not wishing to add insult to injury, but all 4 of my systems, both WinXP and Mac G4s, are having little or no problem communicating with the Scheduler and servers. Timeouts happen about 1 in 10 attempts, and usually go through on the next try, depending on time of day (PST/UTC-8). Downloads are slow (1-5 KBps), but they get through.

All systems go through the same ADSL modem to AT&T PacBell. The G4 is direct-connect, the iBook has an AirPort Wifi card, and the XP boxes are on USB Wifi modems.

The G4s are running BOINC 6.10.58 and 6.10.60, caches set to 0 + 3 days; and both the XP boxes are running 7.0.28, caches are set at 2 + 0.5 days.

Hope this is useful information.
____________
Donald
Infernal Optimist / Submariner, retired

S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 173
Credit: 17,390,897
RAC: 9,428
Netherlands
Message 1307333 - Posted: 18 Nov 2012, 10:36:55 UTC

Some people have bashed the project so many times that I wonder what they are in it for. Is it a competition for the highest RAC, or are there still some people here who know the core purpose of this project:

Finding E.T. some day in the future

If the project goes cold for a while due to circumstances beyond the control of the volunteers working at Berkeley ($$$$!) and you don't like it, why don't you just move on... No hard feelings ;-)

I remember that last year around this time we had a mega outage of over a month, didn't we, and most of you still came back...

I also have to work with limits, proxies and a lot of "manual care" for the project. My RAC has also dropped by about 4000, but it's the end result that counts.

Have a wonderful Sunday anyway, wherever you are on our tiny planet!
____________

Uli
Project donor
Volunteer tester
Joined: 6 Feb 00
Posts: 10009
Credit: 5,470,266
RAC: 192
Germany
Message 1307352 - Posted: 18 Nov 2012, 13:20:22 UTC

+1
____________
Pluto will always be a planet to me.
Order your 15th Seti Anniversary Shirt today. Just PM me for details.
Cash Donation Specialist

Seti Ambassador

Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1307360 - Posted: 18 Nov 2012, 14:06:15 UTC
Last modified: 18 Nov 2012, 14:07:56 UTC

If admins deserve "bashing" then it's this: they could literally take 3 minutes of their time and write a front page news post mentioning the timeout/network problems, so volunteers would know it's not a problem on their side.
Right now, we have people posting their problems all over the place in every subforum, reinstalling boinc, rebooting, (un)plugging network cables, looking for proxies, etc...
____________

juan BFB
Project donor
Volunteer tester
Joined: 16 Mar 07
Posts: 5467
Credit: 313,411,747
RAC: 166,235
Brazil
Message 1307367 - Posted: 18 Nov 2012, 14:22:31 UTC - in response to Message 1307333.
Last modified: 18 Nov 2012, 14:23:03 UTC

Some people have bashed the project so many times that I wonder what they are in it for. Is it a competition for the highest RAC, or are there still some people here who know the core purpose of this project:

Finding E.T. some day in the future

If the project goes cold for a while due to circumstances beyond the control of the volunteers working at Berkeley ($$$$!) and you don't like it, why don't you just move on... No hard feelings ;-)

I remember that last year around this time we had a mega outage of over a month, didn't we, and most of you still came back...

I also have to work with limits, proxies and a lot of "manual care" for the project. My RAC has also dropped by about 4000, but it's the end result that counts.

Have a wonderful Sunday anyway, wherever you are on our tiny planet!


You forget something: the origin of the problem is that they take the resources of SETI and try to run AP on them as well, so if they kept the focus on the main project, finding E.T., nothing like this would be happening.

This is crystal clear: the current configuration (funds, staff, bandwidth, server, router, whatever else, it makes no difference) can't handle the 2 projects running at full capacity at the same time. Something must be done quickly.
____________

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8755
Credit: 52,704,672
RAC: 29,834
United Kingdom
Message 1307369 - Posted: 18 Nov 2012, 14:33:32 UTC - in response to Message 1307367.

Some people have bashed the project so many times that I wonder what they are in it for. Is it a competition for the highest RAC, or are there still some people here who know the core purpose of this project:

Finding E.T. some day in the future

If the project goes cold for a while due to circumstances beyond the control of the volunteers working at Berkeley ($$$$!) and you don't like it, why don't you just move on... No hard feelings ;-)

I remember that last year around this time we had a mega outage of over a month, didn't we, and most of you still came back...

I also have to work with limits, proxies and a lot of "manual care" for the project. My RAC has also dropped by about 4000, but it's the end result that counts.

Have a wonderful Sunday anyway, wherever you are on our tiny planet!

You forget something: the origin of the problem is that they take the resources of SETI and try to run AP on them as well, so if they kept the focus on the main project, finding E.T., nothing like this would be happening.

This is crystal clear: the current configuration (funds, staff, bandwidth, server, router, whatever else, it makes no difference) can't handle the 2 projects running at full capacity at the same time. Something must be done quickly.

From Astropulse FAQ:

Astropulse is a new type of SETI. It expands on the original SETI@home, but does not replace it. The original SETI@home searches for narrowband signals, as does a conventional AM or FM radio. Astropulse, on the other hand, listens for broader-band, short-time pulses.

In addition to ET, Astropulse might detect other sources, such as...

That doesn't detract from your main point, which is that the unmanaged and unregulated attempt to run both halves (of the same project) on the same inadequate infrastructure at the same time is counter-productive. We have passed the point of diminishing returns.

WinterKnight
Volunteer tester
Joined: 18 May 99
Posts: 8737
Credit: 25,592,010
RAC: 13,895
United Kingdom
Message 1307371 - Posted: 18 Nov 2012, 14:42:04 UTC - in response to Message 1307360.

If admins deserve "bashing" then it's this: they could literally take 3 minutes of their time and write a front page news post mentioning the timeout/network problems, so volunteers would know it's not a problem on their side.
Right now, we have people posting their problems all over the place in every subforum, reinstalling boinc, rebooting, (un)plugging network cables, looking for proxies, etc...

Do you realise how many people do NOT read the front page?

The number who do read it is probably in the same ballpark as the number who read the manual, and that is in single-digit percentages. How often RTFM appears backs up that claim.

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4332
Credit: 1,113,358
RAC: 1,079
United States
Message 1307376 - Posted: 18 Nov 2012, 14:48:19 UTC - in response to Message 1307295.

And just to make things even better how about mini work units when you do get some work. I bet that helps the scheduler overload!

The ghosts created November 4 which were shorties have now missed their deadline. I've just been looking at the task list for a host which yesterday showed 1091 tasks "sent" Nov. 4 and now has 792 errors for missed deadlines.
Joe

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8755
Credit: 52,704,672
RAC: 29,834
United Kingdom
Message 1307379 - Posted: 18 Nov 2012, 14:52:48 UTC - in response to Message 1307376.

And just to make things even better how about mini work units when you do get some work. I bet that helps the scheduler overload!

The ghosts created November 4 which were shorties have now missed their deadline. I've just been looking at the task list for a host which yesterday showed 1091 tasks "sent" Nov. 4 and now has 792 errors for missed deadlines.
Joe

And I'm sure I've got at least some of the resends. A very high proportion of this morning's shorties have been replication _2, _3, _4

Mark Lybeck
Joined: 9 Aug 99
Posts: 209
Credit: 105,039,931
RAC: 46,658
Finland
Message 1307383 - Posted: 18 Nov 2012, 15:00:20 UTC

Could it be that an old hub is connected to the network and someone accidentally pushed the crossover/straight connection button on it, causing Rx/Tx to loop indefinitely and clog up all available bandwidth?

Someone at SETI could run Wireshark to see what is going on there.

____________

juan BFB
Project donor
Volunteer tester
Joined: 16 Mar 07
Posts: 5467
Credit: 313,411,747
RAC: 166,235
Brazil
Message 1307384 - Posted: 18 Nov 2012, 15:01:44 UTC - in response to Message 1307369.
Last modified: 18 Nov 2012, 15:05:21 UTC


That doesn't detract from your main point, which is that the unmanaged and unregulated attempt to run both halves (of the same project) on the same inadequate infrastructure at the same time is counter-productive. We have passed the point of diminishing returns.


I agree 100% with you, and one thing must be clear: I have nothing against AP (I even crunch it) and I'm not asking for a complete AP or MB stop. I just think that before we go to Mars we need to develop the technology to stay in space for almost a year, or the Mars project will end in disaster.

You point to a possible, high-confidence hypothesis for the origin of the problem, so why not make a "pit stop" and check? It is clear the current situation is going from bad to worse as the days pass (did you notice the increase in the number of pendings? Soon they will need to stop to clear them, but that is another problem for another time). Or, from another perspective, if they know they can't maintain both projects running at 100%, why not slow both to a comfortable speed (one that makes it less painful for everybody to obtain their WUs), at least until Matt returns and really finds and fixes the problem?

That's what I call right management.

Until then I'm taking the extreme option: activating another project and starting to crunch Einstein for a while, but I really don't feel comfortable with that. My real quest is not for RAC; I really want to help find our little green ET friend (or enemy, who knows?).
____________

fkjgsklfjgisojfg
Joined: 12 Jan 12
Posts: 1
Credit: 723,194
RAC: 616
Message 1307404 - Posted: 18 Nov 2012, 15:40:43 UTC

THE ALIENS HAVE CRASHED MY COMPUTER!!!!!!!!!!!!!!!!!!!

WezH
Volunteer tester
Joined: 19 Aug 99
Posts: 186
Credit: 5,207,478
RAC: 24,466
Finland
Message 1307429 - Posted: 18 Nov 2012, 16:25:30 UTC

Thunder is out of work (again)...

18/11/2012 04:26:21 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 07:18:15 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 10:39:04 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 10:46:17 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 10:53:06 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 10:58:21 | SETI@home | Scheduler request completed: got 20 new tasks
18/11/2012 11:09:07 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 11:37:09 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 11:46:46 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 11:59:50 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 12:23:44 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 12:52:56 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 13:54:27 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 15:11:15 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 17:55:34 | SETI@home | Scheduler request failed: Timeout was reached


In 13 hours, only 1 request out of 15 completed. In that completed request 51 completed tasks were reported and those 20 lost tasks were sent.
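
For anyone who wants to tally their own event log the same way, here is a minimal sketch in Python. It assumes the log uses the `date time | project | message` layout shown in this post; the function name `summarize` and the `LOG` constant are made up for illustration and are not part of BOINC.

```python
import re
from datetime import datetime

# The scheduler lines quoted in this post, one entry per line.
LOG = """\
18/11/2012 04:26:21 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 07:18:15 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 10:39:04 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 10:46:17 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 10:53:06 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 10:58:21 | SETI@home | Scheduler request completed: got 20 new tasks
18/11/2012 11:09:07 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 11:37:09 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 11:46:46 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 11:59:50 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 12:23:44 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 12:52:56 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 13:54:27 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 15:11:15 | SETI@home | Scheduler request failed: Timeout was reached
18/11/2012 17:55:34 | SETI@home | Scheduler request failed: Timeout was reached
"""

# Assumed layout: "DD/MM/YYYY HH:MM:SS | project | message"
LINE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) \| ([^|]*) \| (.*)$")


def summarize(log_text):
    """Count completed and failed scheduler requests and the time they span."""
    times = []
    completed = 0
    failed = 0
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip anything that is not a log entry
        stamp, _project, message = match.groups()
        if message.startswith("Scheduler request completed"):
            completed += 1
        elif message.startswith("Scheduler request failed"):
            failed += 1
        else:
            continue  # ignore downloads, proxy notices, etc.
        times.append(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"))
    span_hours = 0.0
    if times:
        span_hours = (max(times) - min(times)).total_seconds() / 3600
    return {
        "requests": completed + failed,
        "completed": completed,
        "failed": failed,
        "span_hours": span_hours,
    }


if __name__ == "__main__":
    print(summarize(LOG))
```

Run on the lines above, it counts 15 requests with 1 completed, spread over roughly 13.5 hours, which matches the summary given here. Logs in another date format (for example the US-style `11/17/2012 10:23:17 PM` stamps earlier in the thread) would need a different pattern.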


____________
"Please keep Your signature under four lines so Internet traffic doesn't go up too much"

- In 1992 when I had my first e-mail address -

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4205
Credit: 34,457,114
RAC: 21,389
United Kingdom
Message 1307434 - Posted: 18 Nov 2012, 16:42:28 UTC - in response to Message 1307429.
Last modified: 18 Nov 2012, 16:45:32 UTC

Thunder is out of work (again)...

In 13 hours, only 1 request out of 15 completed. In that completed request 51 completed tasks were reported and those 20 lost tasks were sent.

You can increase your chances of getting work by setting No New Tasks, getting your completed tasks reported, and then setting a cache small enough that only one or two tasks are resent at once; scheduler requests are then more likely to get through. Increase the cache level in 0.1 day increments as work is resent.

Or use a proxy.
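
The cache-stepping part of the advice above could be sketched roughly like this in Python. It is only an illustration under stated assumptions: the helper names `cache_steps` and `override_xml` are hypothetical, and the element names `work_buf_min_days` and `work_buf_additional_days` in a `global_prefs_override.xml` file are assumed to be the ones your BOINC client reads, so check them against your own installation before relying on this.

```python
def cache_steps(start_days, target_days, step_days=0.1):
    """Return the cache sizes to walk through, from start up to target."""
    steps = []
    n = 0
    level = start_days
    # Small tolerance so floating-point drift does not add a duplicate step.
    while level < target_days - 1e-9:
        steps.append(round(level, 2))
        n += 1
        level = start_days + n * step_days
    steps.append(round(target_days, 2))
    return steps


def override_xml(min_days, extra_days=0.0):
    """Build the text of a preferences override for one cache level.

    Assumption: the client takes its cache size from these two elements.
    """
    return (
        "<global_preferences>\n"
        f"   <work_buf_min_days>{min_days:.2f}</work_buf_min_days>\n"
        f"   <work_buf_additional_days>{extra_days:.2f}</work_buf_additional_days>\n"
        "</global_preferences>\n"
    )


if __name__ == "__main__":
    # Walk from a 0.1 day cache up to 0.5 days, one step per successful resend.
    for days in cache_steps(0.1, 0.5):
        print(f"--- cache level {days} days ---")
        print(override_xml(days))
```

The script only prints the text for each level; saving it and telling the client to re-read its preferences is left as a manual step, done only after the previous batch of resends has arrived.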

Claggy

WezH
Volunteer tester
Joined: 19 Aug 99
Posts: 186
Credit: 5,207,478
RAC: 24,466
Finland
Message 1307436 - Posted: 18 Nov 2012, 16:50:57 UTC - in response to Message 1307434.

Thunder is out of work (again)...

In 13 hours, only 1 request out of 15 completed. In that completed request 51 completed tasks were reported and those 20 lost tasks were sent.

You can increase your chances of getting work by setting No New Tasks, getting your completed tasks reported, and then setting a cache small enough that only one or two tasks are resent at once; scheduler requests are then more likely to get through. Increase the cache level in 0.1 day increments as work is resent.

Claggy


Yep, I know that, but as I said before:

All right, all my active crunchers are now in "Set & Forget" mode.

No "No New Tasks", no proxies, no manual updates. No user activity of any kind.

"Install and forget"....


This is just a test to see what happens without user action, and we are seeing results.


Copyright © 2014 University of California