Panic Mode On (80) Server Problems?

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13947
Credit: 208,696,464
RAC: 304
Australia
Message 1331868 - Posted: 27 Jan 2013, 3:37:46 UTC - in response to Message 1331866.  
Last modified: 27 Jan 2013, 3:40:35 UTC

But the status page shows lots of work, and the cricket graph seems to show both up and downloading.

If you look closely at the inbound traffic, you'll notice it's only about 10Mb/s; normally, when things are working, it's around 14-16Mb/s.
The lack of inbound traffic is due to the problems contacting the Scheduler.


EDIT- and to add to the Scheduler problems, and the AP validators falling behind for almost a week now, the AP assimilators appear to be not working either; that backlog is increasing at a rapid rate.
Grant
Darwin NT
ID: 1331868 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1331898 - Posted: 27 Jan 2013, 5:54:27 UTC - in response to Message 1331868.  
Last modified: 27 Jan 2013, 5:55:20 UTC

EDIT- and to add to the Scheduler problems, and the AP validators falling behind for almost a week now, the AP assimilators appear to be not working either; that backlog is increasing at a rapid rate.

SSP shows all the AP Assimilators as DISABLED - turned off.
I think they have been that way all weekend, if not all week. I have an AP task that validated on Wednesday 1/23 that has not yet been purged from my account page.
Donald
Infernal Optimist / Submariner, retired
ID: 1331898 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1331925 - Posted: 27 Jan 2013, 8:20:51 UTC - in response to Message 1331846.  

That's it, I'm done.....going to dump all the wu's and go to the other project. I thought Seti finally got their crap together, but apparently I was wrong!!!! Goodbye Seti

I understand the frustration, but really...
To quit after over 10 years, and to throw a temper tantrum on the way out...

Hope he has more success and satisfaction on the other project.
Donald
Infernal Optimist / Submariner, retired
ID: 1331925 · Report as offensive
Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7381
Credit: 44,181,323
RAC: 238
United States
Message 1331979 - Posted: 27 Jan 2013, 13:45:23 UTC

Greetings,

Ok, here's my thing:

The way I understood what BOINC was, back in 04, was that it was an application to 'set-n-forget', right? Right. Well, since I re-started crunching SETI in December or November last year, BOINC has been anything but 'set-n-forget'. Observe:

WUs do not download to my PC automagically. Uploads do go automagically. Finished WUs do not report automagically (please refer to first statement in this paragraph). The only way for the finished WUs to be reported and to download new WUs is to manually hit the "Update" button. I will report anywhere from 10 to 50, or more, WUs and download an equal number after hitting the button, scheduler cooperation notwithstanding.

This to me is not very 'set-n-forget'. :(

I am also running FreeHAL@home and Asteroids@home as backup for when SETI is hosed for whatever reason. FreeHAL runs automagically, Asteroids does sometimes. I do manually report Asteroids on occasion, but I do seem to download automagically. FreeHAL is truly automagic, I very, very seldom ever have to manually update FreeHAL.

With all the above said, it does seem to me to be a software 'problem' with SETI. My better half's BOINC is doing the same thing on her PC. More on hers below...

Oh, and here's another thing too. When I report, my Event log tells me that it is reporting so many WUs and requesting more. My better half's BOINC is not doing the same. Her log will say that BOINC is not requesting more WUs. Every time. She is lucky to get WUs at all. I just built her PC, which is a newer generation Intel i7 3770 than mine is. Even with onboard video she can outdo my RAC, when she gets the work that is. I have compared her BOINC settings to mine and they are virtually identical. What gives?

By the way, I agree with Mark, SETI needs to loosen their restrictions on us.

Keep on BOINCing...! :)

CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 1331979 · Report as offensive
fscheel

Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1331980 - Posted: 27 Jan 2013, 13:56:22 UTC

Is anyone successfully using a proxy? I was getting good results until 3 days ago. Now I cannot connect at all with a proxy and have tried a lot of different ones.

Frank
ID: 1331980 · Report as offensive
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34563
Credit: 79,922,639
RAC: 80
Germany
Message 1331981 - Posted: 27 Jan 2013, 14:00:19 UTC - in response to Message 1331980.  

Is anyone successfully using a proxy? I was getting good results until 3 days ago. Now I cannot connect at all with a proxy and have tried a lot of different ones.

Frank


Sometimes it works with a proxy, sometimes it doesn't.
Lots of baby sitting tho.

With each crime and every kindness we birth our future.
ID: 1331981 · Report as offensive
cdemers
Volunteer tester

Joined: 18 May 99
Posts: 30
Credit: 17,235,002
RAC: 0
Canada
Message 1331983 - Posted: 27 Jan 2013, 14:07:45 UTC

Things are a bit frustrating right now, thinking the project needs to set up a second scheduler on another server and load balance between them. If I remember correctly BOINC does support doing this.

ID: 1331983 · Report as offensive
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34563
Credit: 79,922,639
RAC: 80
Germany
Message 1331986 - Posted: 27 Jan 2013, 14:31:30 UTC - in response to Message 1331983.  

Things are a bit frustrating right now, thinking the project needs to set up a second scheduler on another server and load balance between them. If I remember correctly BOINC does support doing this.


Doesn't resolve the bandwidth issue.

With each crime and every kindness we birth our future.
ID: 1331986 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1331988 - Posted: 27 Jan 2013, 14:36:50 UTC - in response to Message 1331979.  

Greetings,

Ok, here's my thing:

The way I understood what BOINC was, back in 04, was that it was an application to 'set-n-forget', right? Right. Well, since I re-started crunching SETI in December or November last year, BOINC has been anything but 'set-n-forget'. Observe:

WUs do not download to my PC automagically. Uploads do go automagically. Finished WUs do not report automagically (please refer to first statement in this paragraph). The only way for the finished WUs to be reported and to download new WUs is to manually hit the "Update" button. I will report anywhere from 10 to 50, or more, WUs and download an equal number after hitting the button, scheduler cooperation notwithstanding.

This to me is not very 'set-n-forget'. :(

What cache settings are you using? Sounds as if you're still running Boinc 6 Cache settings.

With Boinc 7 it'll wait until it's below the 'Minimum work buffer' ('Maintain enough tasks to keep busy for at least' on the Setiathome computing preferences page) before asking for work again to save on scheduler contacts.

Claggy
ID: 1331988 · Report as offensive
Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1331996 - Posted: 27 Jan 2013, 14:55:10 UTC

I must be one of the extremely lucky ones as I have 50 D/Ls that have been trying to come down the pike overnight. I am now force feeding the machine to get them on board. My fastest machine is down to 42 tasks and that won't last through the day unless I suspend processing for a couple of hours and try to connect again.


I don't buy computers, I build them!!
ID: 1331996 · Report as offensive
Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7381
Credit: 44,181,323
RAC: 238
United States
Message 1331997 - Posted: 27 Jan 2013, 15:04:36 UTC - in response to Message 1331988.  

Greetings,

Ok, here's my thing:

-[ snip ]-

This to me is not very 'set-n-forget'. :(

What cache settings are you using? Sounds as if you're still running Boinc 6 Cache settings.

With Boinc 7 it'll wait until it's below the 'Minimum work buffer' ('Maintain enough tasks to keep busy for at least' on the Setiathome computing preferences page) before asking for work again to save on scheduler contacts.

Claggy

Greetings Claggy,

Ok, I changed my minimum setting. For whatever reason it was set to 0.1, it's been there for like, forever. :/ I changed it to 5. So, that should change the way BOINC communicates with the server(s). Thanks! :)

Keep on BOINCing...! :)

CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 1331997 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1331998 - Posted: 27 Jan 2013, 15:14:27 UTC - in response to Message 1331997.  

Greetings,

Ok, here's my thing:

-[ snip ]-

This to me is not very 'set-n-forget'. :(

What cache settings are you using? Sounds as if you're still running Boinc 6 Cache settings.

With Boinc 7 it'll wait until it's below the 'Minimum work buffer' ('Maintain enough tasks to keep busy for at least' on the Setiathome computing preferences page) before asking for work again to save on scheduler contacts.

Claggy

Greetings Claggy,

Ok, I changed my minimum setting. For whatever reason it was set to 0.1, it's been there for like, forever. :/ I changed it to 5. So, that should change the way BOINC communicates with the server(s). Thanks! :)

Keep on BOINCing...! :)

Make sure you also set the '... and up to an additional' setting to a low value, say 0.01.
If you have a cache setting of 5 + 5 days, Boinc will fill up to 10 days' work, then not ask again until it drops below 5 days.
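
As a rough illustration of the fill-then-wait behaviour described above, here is a minimal Python sketch. It is not BOINC's actual work-fetch code: the function names, the one-day time step, the fixed consumption rate and the assumption that every request is granted in full are all made up for the example.

```python
def fill_target_days(min_days, additional_days):
    """Total work (in days) the client tries to hold after a successful request."""
    return min_days + additional_days


def should_request_work(buffered_days, min_days):
    """Ask for work only once the buffer has dropped below the minimum."""
    return buffered_days < min_days


def simulate(min_days, additional_days, daily_consumption=1.0, days=12):
    """Count scheduler requests over `days` one-day steps.

    Assumption: every request succeeds and refills the cache completely,
    which is clearly optimistic while the scheduler is struggling.
    """
    buffered = fill_target_days(min_days, additional_days)
    requests = 0
    for _ in range(days):
        buffered = max(0.0, buffered - daily_consumption)
        if should_request_work(buffered, min_days):
            buffered = fill_target_days(min_days, additional_days)
            requests += 1
    return requests


if __name__ == "__main__":
    print("5 + 5 days:   ", simulate(5, 5), "requests in 12 days")
    print("5 + 0.01 days:", simulate(5, 0.01), "requests in 12 days")
```

Under these assumptions a 5 + 5 setting contacts the scheduler only twice in 12 days, while 5 + 0.01 asks every day. The small 'additional' value keeps the cache topped up in small bites instead of waiting for a large drop.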

Claggy
ID: 1331998 · Report as offensive
Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1331999 - Posted: 27 Jan 2013, 15:18:02 UTC

I must be one of the extremely lucky ones as I have 50 D/Ls that have been trying to come down the pike overnight. I am now force feeding the machine to get them on board.

I also had some overnight luck and got up to the obsolete limit for gpu and I'm working on the cpu limit. Downloads are slow but are coming through w/o much intervention. Just clear the retries when I want to ask for work. Have had half a dozen successful connects this AM. That's more than all day yesterday. Not using a proxy.
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1331999 · Report as offensive
Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7381
Credit: 44,181,323
RAC: 238
United States
Message 1332001 - Posted: 27 Jan 2013, 15:22:26 UTC - in response to Message 1331998.  

Greetings,

Ok, here's my thing:

-[ snip ]-

This to me is not very 'set-n-forget'. :(

-[ snip ]-

Claggy

Greetings Claggy,

Ok, I changed my minimum setting. For whatever reason it was set to 0.1, it's been there for like, forever. :/ I changed it to 5. So, that should change the way BOINC communicates with the server(s). Thanks! :)

Keep on BOINCing...! :)

Make sure you also set the '... and up to an additional' setting to a low value, say 0.01.
If you have a cache setting of 5 + 5 days, Boinc will fill up to 10 days' work, then not ask again until it drops below 5 days.

Claggy

Greetings Claggy,

Ok, will do. It is set to 4 days right now, I will lower that. Thanks. :)

Keep on BOINCing...! :)


CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 1332001 · Report as offensive
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22800
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1332012 - Posted: 27 Jan 2013, 17:06:10 UTC

I've been getting loads of resends all day on my newest cruncher.
They are taking ages to get through, but the number of tasks visibly downloading or resident is now nearer the number that the website thinks are "resident" on that cruncher, so we might be getting towards the end of a retry storm (of shorties)
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1332012 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1332095 - Posted: 27 Jan 2013, 21:05:38 UTC

My take on downloads not going and scheduler requests not happening due to project back-off is that up until a few months ago, I was still using the last pre-GPU build of BOINC (6.2.19). For that build, there was no project back-off. Every file transfer has its own back-off counter, and if they were all waiting for a re-try, scheduler would not happen automatically, but as soon as one counter reached zero, scheduler request would go out.

Also, the maximum back-off was 3:59:59. I had nearly zero issues with getting work or reporting finished work with that.

Since switching to a more modern (but still very outdated) build, it requires a lot of baby-sitting, because one download will stall, and next thing you know, you're in total lock-down for 18 hours unless you start pressing buttons. I never had to babysit 6.2.19. That project back-off scheme, and a back-off interval of more than 4 hours ruined everything.
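
To make the difference between the two schemes a bit more concrete, here is a small Python sketch. It is only a model: the doubling rule and the 60-second base are assumptions for illustration and are not taken from the BOINC source. The only figure from the post is the 3:59:59 cap.

```python
# Cap quoted in the post for the old 6.2.19 client: 3:59:59.
MAX_BACKOFF_S = 3 * 3600 + 59 * 60 + 59


def backoff_seconds(failures, base_s=60, cap_s=MAX_BACKOFF_S):
    """Capped exponential back-off (assumed rule: double after each failure)."""
    if failures <= 0:
        return 0
    return min(base_s * 2 ** (failures - 1), cap_s)


def wait_with_per_transfer_counters(transfer_backoffs_s):
    """Old behaviour as described: a scheduler request can go out as soon as
    the shortest per-transfer counter reaches zero."""
    return min(transfer_backoffs_s, default=0)


def wait_with_project_backoff(project_backoff_s):
    """Newer behaviour as described: everything waits on one project-wide timer."""
    return project_backoff_s


if __name__ == "__main__":
    counters = [backoff_seconds(n) for n in (2, 8, 12)]
    print("per-transfer counters (s):", counters)
    print("old scheme waits (s):", wait_with_per_transfer_counters(counters))
    print("project back-off of 18 h waits (s):", wait_with_project_backoff(18 * 3600))
```

In this model one stalled download cannot hold up the old client for longer than its shortest counter, whereas a single project-wide timer blocks everything until it expires or someone presses a button.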
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1332095 · Report as offensive
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22800
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1332098 - Posted: 27 Jan 2013, 21:46:50 UTC

The more modern BOINC versions are very bad about using very long delays.
Given that the top crunchers are capable of rattling through tasks at over 1 per minute, these big delays are counterproductive. By the time the delay has worked its way through there are several more tasks to be reported, and more tasks being demanded. Short delays, and clearing out part-downloads in preference to un-started ones, would all help get rid of the backlog, probably reduce the re-try rate and so ease the burden on the servers. As it is, I would say the scheduler is set up in such a way as to generate queues, not prevent them.
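
A quick back-of-the-envelope check of that point, as a Python sketch. The function name is made up; the 1 task per minute rate is from this post, and the 4-hour and 18-hour delays are the figures mentioned earlier in the thread.

```python
def tasks_waiting(rate_per_minute, delay_hours):
    """Finished tasks that pile up unreported while the client sits in back-off."""
    return rate_per_minute * delay_hours * 60


if __name__ == "__main__":
    for hours in (4, 18):
        print(f"{hours} h delay at 1 task/min:", tasks_waiting(1, hours), "tasks to report")
```

So a host that finishes one task a minute comes out of a long delay with hundreds of results to report at once, plus a matching request for new work.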
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1332098 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13947
Credit: 208,696,464
RAC: 304
Australia
Message 1332099 - Posted: 27 Jan 2013, 21:53:53 UTC - in response to Message 1331988.  
Last modified: 27 Jan 2013, 21:59:44 UTC

What cache settings are you using? Sounds as if you're still running Boinc 6 Cache settings.

I'm using V6 & at the moment it is not set & forget- you have to keep on hitting retry when the Scheduler is borked, or disable & re-enable network access when the Scheduler is working but the network traffic is maxed out. Not spending all day at the computer means I've been running out of work quite regularly over the last couple of weeks.

Although it does appear the Scheduler came back to life a few hours ago, so now many of the downloads take 10min or more (just for MB).
Grant
Darwin NT
ID: 1332099 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13947
Credit: 208,696,464
RAC: 304
Australia
Message 1332101 - Posted: 27 Jan 2013, 21:59:03 UTC - in response to Message 1331986.  

Things are a bit frustrating right now, thinking the project needs to set up a second scheduler on another server and load balance between them. If I remember correctly BOINC does support doing this.


Doesn't resolve the bandwidth issue.

I think that, as bad as the servers have been, is probably the biggest issue. Even when the servers are working, work is damn near impossible to come by.
For a while there they were using the campus network for Scheduler requests, and it was great. No problems with data from the peer, no timeouts, no problems connecting to the server & no matter how many tasks you were reporting & how much work you were requesting, 5 seconds was the longest it was taking to get a response. Often the responses came within 3 seconds.
Grant
Darwin NT
ID: 1332101 · Report as offensive
Tom*

Joined: 12 Aug 11
Posts: 127
Credit: 20,769,223
RAC: 9
United States
Message 1332114 - Posted: 27 Jan 2013, 22:45:45 UTC - in response to Message 1332101.  

Grant

The last time they used the campus network for scheduler requests

This is what happened.

The scheduler will be down until someone can get to the lab to reboot it. I'll try to convince Angela to let me go in once the turkey is in the oven.

Eric
ID: 1332114 · Report as offensive


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.