Panic Mode On (80) Server Problems?


Message boards : Number crunching : Panic Mode On (80) Server Problems?

Donald L. Johnson (Project donor)
Joined: 5 Aug 02
Posts: 6123
Credit: 673,887
RAC: 1,181
United States
Message 1331898 - Posted: 27 Jan 2013, 5:54:27 UTC - in response to Message 1331868.
Last modified: 27 Jan 2013, 5:55:20 UTC

EDIT- and to add to the Scheduler problems, and the AP validators falling behind for almost a week now, the AP assimilators appear not to be working either; that backlog is increasing at a rapid rate.

SSP shows all the AP Assimilators as DISABLED - turned off.
I think they have been that way all weekend, if not all week. I have an AP task that validated on Wednesday 1/23 that has not yet been purged from my account page.
____________
Donald
Infernal Optimist / Submariner, retired

CLYDE (Project donor)
Volunteer tester
Joined: 9 Aug 99
Posts: 1877
Credit: 22,226,713
RAC: 36,066
United States
Message 1331921 - Posted: 27 Jan 2013, 7:41:23 UTC - in response to Message 1331846.
Last modified: 27 Jan 2013, 7:42:28 UTC

That's it, I'm done.....going to dump all the WUs and go to the other project. I thought Seti finally got their crap together, but apparently I was wrong!!!! Goodbye Seti

MusicGod...

OK
____________

Donald L. Johnson (Project donor)
Joined: 5 Aug 02
Posts: 6123
Credit: 673,887
RAC: 1,181
United States
Message 1331925 - Posted: 27 Jan 2013, 8:20:51 UTC - in response to Message 1331846.

That's it, I'm done.....going to dump all the WUs and go to the other project. I thought Seti finally got their crap together, but apparently I was wrong!!!! Goodbye Seti

I understand the frustration, but really...
To quit after over 10 years, and to throw a temper tantrum on the way out...

Hope he has more success and satisfaction on the other project.
____________
Donald
Infernal Optimist / Submariner, retired

Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 5686
Credit: 4,647,122
RAC: 2,949
United States
Message 1331979 - Posted: 27 Jan 2013, 13:45:23 UTC

Greetings,

Ok, here's my thing:

The way I understood what BOINC was, back in 04, was that it was an application to 'set-n-forget', right? Right. Well, since I re-started crunching SETI in December or November last year, BOINC has been anything but 'set-n-forget'. Observe:

WUs do not download to my PC automagically. Uploads do go automagically. Finished WUs do not report automagically (please refer to first statement in this paragraph). The only way for the finished WUs to be reported and to download new WUs is to manually hit the "Update" button. I will report anywhere from 10 to 50, or more, WUs and download an equal number after hitting the button, scheduler cooperation notwithstanding.

This to me is not very 'set-n-forget'. :(

I am also running FreeHAL@home and Asteroids@home as backup for when SETI is hosed for whatever reason. FreeHAL runs automagically, Asteroids does sometimes. I do manually report Asteroids on occasion, but I do seem to download automagically. FreeHAL is truly automagic, I very, very seldom ever have to manually update FreeHAL.

With all the above said, it does seem to me to be a software 'problem' with SETI. My better half's BOINC is doing the same thing on her PC. More on hers below...

Oh, and here's another thing too. When I report, my Event log tells me that it is reporting so many WUs and requesting more. My better half's BOINC is not doing the same. Her log will say that BOINC is not requesting more WUs. Every time. She is lucky to get WUs at all. I just built her PC, which has a newer-generation Intel i7 (a 3770) than mine. Even with onboard video she can outdo my RAC, when she gets the work that is. I have compared her BOINC settings to mine and they are virtually identical. What gives?

By the way, I agree with Mark, SETI needs to loosen their restrictions on us.

Keep on BOINCing...! :)

____________
CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1331980 - Posted: 27 Jan 2013, 13:56:22 UTC

Is anyone successfully using a proxy? I was getting good results until 3 days ago. Now I cannot connect at all with a proxy and have tried a lot of different ones.

Frank

Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 23789
Credit: 32,586,712
RAC: 24,087
Germany
Message 1331981 - Posted: 27 Jan 2013, 14:00:19 UTC - in response to Message 1331980.

Is anyone successfully using a proxy? I was getting good results until 3 days ago. Now I cannot connect at all with a proxy and have tried a lot of different ones.

Frank


Sometimes it works with a proxy, sometimes it doesn't.
Lots of babysitting, though.

____________

cdemers
Volunteer tester
Joined: 18 May 99
Posts: 29
Credit: 15,997,563
RAC: 762
Canada
Message 1331983 - Posted: 27 Jan 2013, 14:07:45 UTC

Things are a bit frustrating right now. I'm thinking the project needs to set up a second scheduler on another server and load balance between them. If I remember correctly, BOINC does support doing this.

____________

Mike (Project donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 23789
Credit: 32,586,712
RAC: 24,087
Germany
Message 1331986 - Posted: 27 Jan 2013, 14:31:30 UTC - in response to Message 1331983.

Things are a bit frustrating right now. I'm thinking the project needs to set up a second scheduler on another server and load balance between them. If I remember correctly, BOINC does support doing this.


Doesn't resolve the bandwidth issue.

____________

Claggy (Project donor)
Volunteer tester
Joined: 5 Jul 99
Posts: 4067
Credit: 32,880,038
RAC: 6,844
United Kingdom
Message 1331988 - Posted: 27 Jan 2013, 14:36:50 UTC - in response to Message 1331979.

Greetings,

Ok, here's my thing:

The way I understood what BOINC was, back in 04, was that it was an application to 'set-n-forget', right? Right. Well, since I re-started crunching SETI in December or November last year, BOINC has been anything but 'set-n-forget'. Observe:

WUs do not download to my PC automagically. Uploads do go automagically. Finished WUs do not report automagically (please refer to first statement in this paragraph). The only way for the finished WUs to be reported and to download new WUs is to manually hit the "Update" button. I will report anywhere from 10 to 50, or more, WUs and download an equal number after hitting the button, scheduler cooperation notwithstanding.

This to me is not very 'set-n-forget'. :(

What cache settings are you using? Sounds as if you're still running Boinc 6 Cache settings.

With Boinc 7 it'll wait until it's below the 'Minimum work buffer' ('Maintain enough tasks to keep busy for at least' on the Setiathome computing preferences page) before asking for work again to save on scheduler contacts.

Claggy

Cliff Harding (Project donor)
Volunteer tester
Joined: 18 Aug 99
Posts: 964
Credit: 50,769,627
RAC: 41,275
United States
Message 1331996 - Posted: 27 Jan 2013, 14:55:10 UTC

I must be one of the extremely lucky ones as I have 50 D/Ls that have been trying to come down the pike over night. I am now force feeding the machine to get them on board. My fastest machine is down to 42 tasks and that won't last through the day unless I suspend processing for a couple of hours and try to connect again.
____________


I don't buy computers, I build them!!

Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 5686
Credit: 4,647,122
RAC: 2,949
United States
Message 1331997 - Posted: 27 Jan 2013, 15:04:36 UTC - in response to Message 1331988.

Greetings,

Ok, here's my thing:

-[ snip ]-

This to me is not very 'set-n-forget'. :(

What cache settings are you using? Sounds as if you're still running Boinc 6 Cache settings.

With Boinc 7 it'll wait until it's below the 'Minimum work buffer' ('Maintain enough tasks to keep busy for at least' on the Setiathome computing preferences page) before asking for work again to save on scheduler contacts.

Claggy

Greetings Claggy,

Ok, I changed my minimum setting. For whatever reason it was set to 0.1, it's been there for like, forever. :/ I changed it to 5. So, that should change the way BOINC communicates with the server(s). Thanks! :)

Keep on BOINCing...! :)

____________
CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

Claggy (Project donor)
Volunteer tester
Joined: 5 Jul 99
Posts: 4067
Credit: 32,880,038
RAC: 6,844
United Kingdom
Message 1331998 - Posted: 27 Jan 2013, 15:14:27 UTC - in response to Message 1331997.

Greetings,

Ok, here's my thing:

-[ snip ]-

This to me is not very 'set-n-forget'. :(

What cache settings are you using? Sounds as if you're still running Boinc 6 Cache settings.

With Boinc 7 it'll wait until it's below the 'Minimum work buffer' ('Maintain enough tasks to keep busy for at least' on the Setiathome computing preferences page) before asking for work again to save on scheduler contacts.

Claggy

Greetings Claggy,

Ok, I changed my minimum setting. For whatever reason it was set to 0.1, it's been there for like, forever. :/ I changed it to 5. So, that should change the way BOINC communicates with the server(s). Thanks! :)

Keep on BOINCing...! :)

Make sure you also set the '... and up to an additional' setting to a low value, say 0.01.
If you have a cache setting of 5 + 5 days, Boinc will fill up to 10 days of work, then not ask again until it drops below 5 days.

Claggy
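
For anyone wanting to see why the two settings interact this way, here is a minimal Python sketch of the refill rule as Claggy describes it. It is only a model of the description above, not BOINC's actual code: the `WorkBuffer` class, its method names, and the `count_requests` helper are all made up for illustration, and the cache is assumed to drain at exactly one day of work per day.

```python
from dataclasses import dataclass


@dataclass
class WorkBuffer:
    """Hypothetical model of the two cache preferences, in days of work."""
    min_days: float         # 'Maintain enough tasks to keep busy for at least'
    additional_days: float  # '... and up to an additional'

    def should_request_work(self, cached_days: float) -> bool:
        # No request is made while the cache is at or above the minimum buffer.
        return cached_days < self.min_days

    def days_to_request(self, cached_days: float) -> float:
        # Once below the minimum, top up to minimum + additional.
        if not self.should_request_work(cached_days):
            return 0.0
        return (self.min_days + self.additional_days) - cached_days


def count_requests(buf: WorkBuffer, days: float = 30.0, step: float = 0.25) -> int:
    """Count scheduler contacts over `days`, draining `step` days of work per step."""
    cached = 0.0
    requests = 0
    elapsed = 0.0
    while elapsed < days:
        if buf.should_request_work(cached):
            cached += buf.days_to_request(cached)
            requests += 1
        cached = max(0.0, cached - step)
        elapsed += step
    return requests


if __name__ == "__main__":
    for additional in (5.0, 0.01):
        buf = WorkBuffer(min_days=5.0, additional_days=additional)
        print(f"5 + {additional} days:", count_requests(buf), "requests in 30 days")
```

Under these assumptions, a 5 + 5 day cache makes a few large requests and holds up to 10 days of work, while 5 + 0.01 asks more often for small top-ups and stays close to 5 days.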

Fred E. (Project donor)
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,139,004
RAC: 233
United States
Message 1331999 - Posted: 27 Jan 2013, 15:18:02 UTC

I must be one of the extremely lucky ones as I have 50 D/Ls that have been trying to come down the pike over night. I am now force feeding the machine to get them on board.

I also had some overnight luck and got up to the obsolete limit for gpu and I'm working on the cpu limit. Downloads are slow but are coming through w/o much intervention. Just clear the retries when I want to ask for work. Have had half a dozen successful connects this AM. That's more than all day yesterday. Not using a proxy.
____________
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.

Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 5686
Credit: 4,647,122
RAC: 2,949
United States
Message 1332001 - Posted: 27 Jan 2013, 15:22:26 UTC - in response to Message 1331998.

Greetings,

Ok, here's my thing:

-[ snip ]-

This to me is not very 'set-n-forget'. :(

-[ snip ]-

Claggy

Greetings Claggy,

Ok, I changed my minimum setting. For whatever reason it was set to 0.1, it's been there for like, forever. :/ I changed it to 5. So, that should change the way BOINC communicates with the server(s). Thanks! :)

Keep on BOINCing...! :)

Make sure you also set the '... and up to an additional' setting to a low value, say 0.01.
If you have a cache setting of 5 + 5 days, Boinc will fill up to 10 days of work, then not ask again until it drops below 5 days.

Claggy

Greetings Claggy,

Ok, will do. It is set to 4 days right now, I will lower that. Thanks. :)

Keep on BOINCing...! :)


____________
CAPT Siran d'Vel'nahr XO
USS Vre'kasht NCC-33187

Siran's website: [ ONLINE! ]

rob smith (Project donor)
Volunteer tester
Joined: 7 Mar 03
Posts: 8301
Credit: 55,114,648
RAC: 74,642
United Kingdom
Message 1332012 - Posted: 27 Jan 2013, 17:06:10 UTC

I've been getting loads of resends all day on my newest cruncher.
They are taking ages to get through, but the number of tasks visibly downloading or resident is now nearer the number that the website thinks are "resident" on that cruncher, so we might be getting towards the end of a retry storm (of shorties)
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2245
Credit: 8,589,438
RAC: 4,316
United States
Message 1332095 - Posted: 27 Jan 2013, 21:05:38 UTC

My take on downloads stalling and scheduler requests not going out because of project back-off: up until a few months ago, I was still using the last pre-GPU build of BOINC (6.2.19). That build had no project back-off. Every file transfer had its own back-off counter, and if they were all waiting for a retry, no scheduler request would happen automatically, but as soon as one counter reached zero, a scheduler request would go out.

Also, the maximum back-off was 3:59:59. I had nearly zero issues with getting work or reporting finished work with that.

Since switching to a more modern (but still very outdated) build, it requires a lot of baby-sitting, because one download will stall, and next thing you know, you're in total lock-down for 18 hours unless you start pressing buttons. I never had to babysit 6.2.19. That project back-off scheme, and a back-off interval of more than 4 hours ruined everything.
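
The difference between the two schemes, as described above, can be sketched roughly like this in Python. This is a simplified model of the post's description and not code from any BOINC build: the function names are hypothetical, and the only number taken from the thread is the 3:59:59 cap.

```python
# 3:59:59 in seconds, the maximum per-transfer back-off quoted for 6.2.19.
MAX_TRANSFER_BACKOFF_S = 3 * 3600 + 59 * 60 + 59


def wait_per_transfer_scheme(transfer_backoffs_s):
    """Old scheme as described: every transfer has its own counter, and a
    scheduler request can go out as soon as any one counter reaches zero."""
    if not transfer_backoffs_s:
        return 0
    capped = [min(b, MAX_TRANSFER_BACKOFF_S) for b in transfer_backoffs_s]
    return min(capped)


def wait_project_wide_scheme(project_backoff_s):
    """Newer scheme as described: a single project-wide counter gates
    everything, so one stalled download holds up all other traffic."""
    return project_backoff_s


if __name__ == "__main__":
    stalled = [600, 7200, 20000]  # hypothetical per-transfer back-offs, seconds
    print("per-transfer wait:", wait_per_transfer_scheme(stalled), "s")
    print("project-wide wait:", wait_project_wide_scheme(18 * 3600), "s")
```

In the per-transfer model the wait is bounded by the shortest counter and never exceeds the cap; in the project-wide model it is whatever the single counter says, which is how an 18 hour lock-down becomes possible.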
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

rob smith (Project donor)
Volunteer tester
Joined: 7 Mar 03
Posts: 8301
Credit: 55,114,648
RAC: 74,642
United Kingdom
Message 1332098 - Posted: 27 Jan 2013, 21:46:50 UTC

The more modern BOINC versions are very bad for using very long delays.
Given that the top crunchers are capable of rattling through tasks at over 1 per minute, these big delays are counterproductive. By the time the delay has worked its way through, there are several more tasks to be reported and more tasks being demanded. Short delays, and clearing out part-downloads in preference to un-started ones, would all help get rid of the backlog, probably reduce the retry rate, and so ease the burden on the servers. As it is, I would say the scheduler is set up in such a way as to generate queues, not prevent them.
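
As a rough back-of-the-envelope check on that point, here is a small Python sketch. The function name is made up; the 1 task per minute rate comes from the post above, and the 4 hour and 18 hour delays are the figures quoted earlier in the thread.

```python
def tasks_finished_during_delay(tasks_per_minute: float, delay_hours: float) -> int:
    """Tasks a host completes (and so has waiting to report) during one back-off."""
    return int(tasks_per_minute * delay_hours * 60)


if __name__ == "__main__":
    for hours in (4, 18):
        waiting = tasks_finished_during_delay(1.0, hours)
        print(f"{hours} h back-off: {waiting} tasks waiting to report")
```

This ignores whether the host still has work left to crunch during the delay, so treat the numbers as an upper bound.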
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5791
Credit: 57,940,927
RAC: 47,771
Australia
Message 1332099 - Posted: 27 Jan 2013, 21:53:53 UTC - in response to Message 1331988.
Last modified: 27 Jan 2013, 21:59:44 UTC

What cache settings are you using? Sounds as if you're still running Boinc 6 Cache settings.

I'm using V6, and at the moment it is not set & forget: you have to keep hitting retry when the Scheduler is borked, or disable and re-enable network access when the Scheduler is working but the network traffic is maxed out. Not spending all day at the computer means I've been running out of work quite regularly over the last couple of weeks.

Although it does appear the Scheduler came back to life a few hours ago, so now many of the downloads take 10 min or more (just for MB).
____________
Grant
Darwin NT.

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5791
Credit: 57,940,927
RAC: 47,771
Australia
Message 1332101 - Posted: 27 Jan 2013, 21:59:03 UTC - in response to Message 1331986.

Things are a bit frustrating right now. I'm thinking the project needs to set up a second scheduler on another server and load balance between them. If I remember correctly, BOINC does support doing this.


Doesn't resolve the bandwidth issue.

I think that, as bad as the servers have been, is probably the biggest issue. Even when the servers are working, work is damn near impossible to come by.
For a while there they were using the campus network for Scheduler requests, and it was great. No problems with data from the peer, no timeouts, no problems connecting to the server, and no matter how many tasks you were reporting or how much work you were requesting, 5 seconds was the longest it took to get a response. Often the responses came within 3 seconds.
____________
Grant
Darwin NT.

Tom*
Joined: 12 Aug 11
Posts: 114
Credit: 4,773,706
RAC: 14,540
United States
Message 1332114 - Posted: 27 Jan 2013, 22:45:45 UTC - in response to Message 1332101.

Grant

The last time they used the campus network for scheduler requests, this is what happened:

The scheduler will be down until someone can get to the lab to reboot it. I'll try to convince Angela to let me go in once the turkey is in the oven.

Eric


Copyright © 2014 University of California