Panic Mode On (80) Server Problems?

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1331708 - Posted: 26 Jan 2013, 21:46:35 UTC - in response to Message 1331707.  


One system out of GPU work, the other soon to be out of GPU & CPU work.
Scheduler still borked.
Grant
Darwin NT
ID: 1331708 · Report as offensive
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1331715 - Posted: 26 Jan 2013, 21:55:52 UTC - in response to Message 1331708.  


One system out of GPU work, the other soon to be out of GPU & CPU work.
Scheduler still borked.

Again? Tell something new...

ID: 1331715 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1331750 - Posted: 26 Jan 2013, 22:45:36 UTC - in response to Message 1331715.  


One system out of GPU work, the other soon to be out of GPU & CPU work.
Scheduler still borked.

Again? Tell something new...

It will sometimes time out; it's not all "Couldn't contact Scheduler" messages.
Grant
Darwin NT
ID: 1331750 · Report as offensive
chromespringer
Joined: 3 Dec 05
Posts: 296
Credit: 55,183,482
RAC: 0
United States
Message 1331759 - Posted: 26 Jan 2013, 23:01:29 UTC - in response to Message 1331715.  


One system out of GPU work, the other soon to be out of GPU & CPU work.
Scheduler still borked.

Again? Tell something new...

1/26/2013 3:56:46 PM | | Project communication failed: attempting access to reference site
1/26/2013 3:56:47 PM | | Internet access OK - project servers may be temporarily down.

Oh, this isn't new .. never mind :-)
ID: 1331759 · Report as offensive
Michael W.F. Miles
Joined: 24 Mar 07
Posts: 268
Credit: 34,410,870
RAC: 0
Canada
Message 1331760 - Posted: 26 Jan 2013, 23:02:19 UTC - in response to Message 1331589.  

MSattler says

The limits have not accomplished much.....current situation proves that.




I could not agree more. The one thing the limits have done is piss everyone off, and a lot have again gone to other projects.

I will say it again even though the last time I said it I got really bad reactions. As a temp measure suspend new accounts until this major problem gets ironed out.
I understand what that means but something has to break other than our patience and the servers.

Don't worry though. Piss enough people off and the membership list will go down all by itself and that is not the way to find other life forms in our Multiverses


If the limits were raised back up to normal then we would not have to contact the servers so much.
As it is now, with just my machine, I have to fill my cache of 200 twice a day.
If we got to have our 10 day cache then we would only need to contact the servers once or twice a week.

That makes sense to me but what do I know
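
To put rough numbers on the cache argument above, here is a minimal sketch. It assumes the 200-task cache really does drain twice a day at a steady rate, and that a client contacts the scheduler only to refill an empty cache. That is a simplification: a real client also contacts the server to report results. The function name is made up for illustration; only the 200 tasks, twice a day, and 10 days come from the post.

```python
def contacts_per_week(cache_tasks: float, tasks_per_day: float) -> float:
    """Scheduler contacts per week if the client only refills an empty cache."""
    days_per_refill = cache_tasks / tasks_per_day
    return 7 / days_per_refill


# From the post: a 200-task cache refilled twice a day, so about 400 tasks/day.
tasks_per_day = 200 * 2

# Current limit: 200 tasks on hand.
limited = contacts_per_week(cache_tasks=200, tasks_per_day=tasks_per_day)

# A 10-day cache at the same burn rate would hold 10 * 400 = 4000 tasks.
ten_day = contacts_per_week(cache_tasks=10 * tasks_per_day, tasks_per_day=tasks_per_day)

print(f"200-task limit: {limited:.1f} contacts/week")
print(f"10-day cache:   {ten_day:.1f} contacts/week")
```

Under those assumptions the contact count drops from about 14 a week to under one. Note that the same number of tasks still has to be downloaded either way, so this says nothing about total bandwidth.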

Michael Miles


ID: 1331760 · Report as offensive
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1331770 - Posted: 26 Jan 2013, 23:32:15 UTC - in response to Message 1331760.  

Things will come good again, eventually, but I'm not going to get panicked about it as I'll just let my backup projects fight it out for a bit of my hardware time.

Cheers.
ID: 1331770 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1331786 - Posted: 27 Jan 2013, 0:00:10 UTC

It's The Twilight Zone. I haven't been able to connect reliably for days. When I could connect at all, I mostly would only receive CUDA MBs. Of course I finally ran out of ATI APs. I then tried to switch to ATI MBs, but they all errored out with:
26-Jan-2013 16:40:09 [SETI@home] Task 16dc12aa.22859.9985.8.10.160_1 exited with zero status but no 'finished' file
26-Jan-2013 16:40:09 [SETI@home] If this happens repeatedly you may need to reset the project.
26-Jan-2013 16:40:09 [SETI@home] Task 16dc12aa.22859.9985.8.10.55_1 exited with zero status but no 'finished' file
26-Jan-2013 16:40:09 [SETI@home] If this happens repeatedly you may need to reset the project....
They worked fine before.

I finally gave up and hit the Reset Button.

First thing I got was 20 'lost files', all ATI APs. I'm now up to 56 'LOST' ATI APs. The other files I had when I hit the Reset Button haven't even started downloading yet.
ID: 1331786 · Report as offensive
bill

Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1331788 - Posted: 27 Jan 2013, 0:04:39 UTC - in response to Message 1331760.  

MSattler says

The limits have not accomplished much.....current situation proves that.




I could not agree more. The one thing the limits have done is piss everyone off, and a lot have again gone to other projects.


Not everyone. The limits haven't pissed me off, and I bet there are a bunch of people who don't even know that there are limits in place.

I will say it again even though the last time I said it I got really bad reactions. As a temp measure suspend new accounts until this major problem gets ironed out.
I understand what that means but something has to break other than our patience and the servers.

Don't worry though. Piss enough people off and the membership list will go down all by itself and that is not the way to find other life forms in our Multiverses


Until such time that the number of people crunching drops to the point where bandwidth drops to 80% of maximum throughput, I don't think anyone at Berkeley will really be concerned about it.


If the limits were raised back up to normal then we would not have to contact the servers so much.


I don't think so. The servers would probably be even harder to contact than they are now.


As it is now, with just my machine, I have to fill my cache of 200 twice a day.
If we got to have our 10 day cache then we would only need to contact the servers once or twice a week.

That makes sense to me but what do I know

Michael Miles



ID: 1331788 · Report as offensive
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 1331818 - Posted: 27 Jan 2013, 1:04:04 UTC
Last modified: 27 Jan 2013, 1:05:11 UTC

The scheduler is FUBAR again. I've been trying since 12 Noon Berkeley time for some GPU units to test my new GTX 660 and get the same response every time:

1/26/2013 4:38:10 PM SETI@home Scheduler request failed: Failure when receiving data from the peer


The scheduler needs to be rebooted, in all likelihood.

It's now 5 PM Berkeley time, if anyone had been in the lab, they've probably gone home!
.

Hello, from Albany, CA!...
ID: 1331818 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1331825 - Posted: 27 Jan 2013, 1:29:03 UTC - in response to Message 1331818.  

The scheduler is FUBAR again. I've been trying since 12 Noon Berkeley time for some GPU units to test my new GTX 660 and get the same response every time:

1/26/2013 4:38:10 PM SETI@home Scheduler request failed: Failure when receiving data from the peer

At least you can contact the Scheduler.
The response most people have been getting for the last few days is "Couldn't connect to server", with the occasional peer error and, even more occasionally, an actual response from the Scheduler.
Grant
Darwin NT
ID: 1331825 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1331842 - Posted: 27 Jan 2013, 1:58:44 UTC - in response to Message 1331786.  

It's The Twilight Zone. I haven't been able to connect reliably for days. When I could connect at all, I mostly would only receive CUDA MBs. Of course I finally ran out of ATI APs. I then tried to switch to ATI MBs, but they all errored out with:
26-Jan-2013 16:40:09 [SETI@home] Task 16dc12aa.22859.9985.8.10.160_1 exited with zero status but no 'finished' file
26-Jan-2013 16:40:09 [SETI@home] If this happens repeatedly you may need to reset the project.
26-Jan-2013 16:40:09 [SETI@home] Task 16dc12aa.22859.9985.8.10.55_1 exited with zero status but no 'finished' file
26-Jan-2013 16:40:09 [SETI@home] If this happens repeatedly you may need to reset the project....
They worked fine before.

I finally gave up and hit the Reset Button.

First thing I got was 20 'lost files', all ATI APs. I'm now up to 56 'LOST' ATI APs. The other files I had when I hit the Reset Button haven't even started downloading yet.

I think I know where those 50+ ATI APs came from. Apparently those were the CPU APs I had when I hit the reset button. The scheduler decided to resend them as ATI APs. It also timed out my CUDA MBs, which I could use, and resent the ATI MBs that I don't need now that I have 50+ ATI APs. Of course, now I have nothing for the CUDA card, or my CPUs. The ATI card is happy though...
ID: 1331842 · Report as offensive
MusicGod
Joined: 7 Dec 02
Posts: 97
Credit: 24,782,870
RAC: 0
United States
Message 1331846 - Posted: 27 Jan 2013, 2:18:42 UTC

That's it, I'm done.....going to dump all the WUs and go to the other project. I thought Seti finally got their crap together, but apparently I was wrong!!!! Goodbye Seti
ID: 1331846 · Report as offensive
Qui-Gon
Volunteer tester
Joined: 15 May 99
Posts: 2940
Credit: 19,199,902
RAC: 11
United States
Message 1331866 - Posted: 27 Jan 2013, 3:25:48 UTC

I get these messages, over and over again:

1/26/2013 5:17:31 PM SETI@home Sending scheduler request: Requested by user.
1/26/2013 5:17:31 PM SETI@home Reporting 1 completed tasks, not requesting new tasks
1/26/2013 5:17:53 PM Project communication failed: attempting access to reference site
1/26/2013 5:17:53 PM SETI@home Scheduler request failed: Couldn't connect to server
1/26/2013 5:17:54 PM Internet access OK - project servers may be temporarily down.


But the status page shows lots of work, and the cricket graph seems to show both uploading and downloading. This is frustrating. It took hours to report my completed work, and now I can't connect even though I'm not requesting new work.
ID: 1331866 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1331868 - Posted: 27 Jan 2013, 3:37:46 UTC - in response to Message 1331866.  
Last modified: 27 Jan 2013, 3:40:35 UTC

But the status page shows lots of work, and the cricket graph seems to show both uploading and downloading.

If you look closely at the inbound traffic, you'll notice it's only about 10Mb/s; normally, when things are working, it's around 14-16Mb/s.
The lack of inbound traffic is due to the problems contacting the Scheduler.


EDIT- and to add to the Scheduler problems, and the AP validators falling behind for almost a week now, the AP assimilators appear to be not working either; that backlog is increasing at a rapid rate.
Grant
Darwin NT
ID: 1331868 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1331898 - Posted: 27 Jan 2013, 5:54:27 UTC - in response to Message 1331868.  
Last modified: 27 Jan 2013, 5:55:20 UTC

EDIT- and to add to the Scheduler problems, and the AP validators falling behind for almost a week now, the AP assimilators appear to be not working either; that backlog is increasing at a rapid rate.

SSP shows all the AP Assimilators as DISABLED - turned off.
I think they have been that way all weekend, if not all week. I have an AP task that validated on Wednesday 1/23 that has not yet been purged from my account page.
Donald
Infernal Optimist / Submariner, retired
ID: 1331898 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1331925 - Posted: 27 Jan 2013, 8:20:51 UTC - in response to Message 1331846.  

That's it, I'm done.....going to dump all the WUs and go to the other project. I thought Seti finally got their crap together, but apparently I was wrong!!!! Goodbye Seti

I understand the frustration, but really...
To quit after over 10 years, and to throw a temper tantrum on the way out...

Hope he has more success and satisfaction on the other project.
Donald
Infernal Optimist / Submariner, retired
ID: 1331925 · Report as offensive
Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7379
Credit: 44,181,323
RAC: 238
United States
Message 1331979 - Posted: 27 Jan 2013, 13:45:23 UTC

Greetings,

Ok, here's my thing:

The way I understood what BOINC was, back in 04, was that it was an application to 'set-n-forget', right? Right. Well, since I re-started crunching SETI in December or November last year, BOINC has been anything but 'set-n-forget'. Observe:

WUs do not download to my PC automagically. Uploads do go automagically. Finished WUs do not report automagically (please refer to first statement in this paragraph). The only way for the finished WUs to be reported and to download new WUs is to manually hit the "Update" button. I will report anywhere from 10 to 50, or more, WUs and download an equal number after hitting the button, scheduler cooperation notwithstanding.

This to me is not very 'set-n-forget'. :(

I am also running FreeHAL@home and Asteroids@home as backup for when SETI is hosed for whatever reason. FreeHAL runs automagically, Asteroids does sometimes. I do manually report Asteroids on occasion, but I do seem to download automagically. FreeHAL is truly automagic, I very, very seldom ever have to manually update FreeHAL.

With all the above said, it does seem to me to be a software 'problem' with SETI. My better half's BOINC is doing the same thing on her PC. More on hers below...

Oh, and here's another thing too. When I report, my Event log tells me that it is reporting so many WUs and requesting more. My better half's BOINC is not doing the same. Her log will say that BOINC is not requesting more WUs. Every time. She is lucky to get WUs at all. I just built her PC, which has a newer-generation Intel i7 3770 than mine does. Even with onboard video she can outdo my RAC, when she gets the work, that is. I have compared her BOINC settings to mine and they are virtually identical. What gives?

By the way, I agree with Mark, SETI needs to loosen their restrictions on us.

Keep on BOINCing...! :)

CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 1331979 · Report as offensive
fscheel

Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1331980 - Posted: 27 Jan 2013, 13:56:22 UTC

Is anyone successfully using a proxy? I was getting good results until 3 days ago. Now I cannot connect at all with a proxy and have tried a lot of different ones.

Frank
ID: 1331980 · Report as offensive
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1331981 - Posted: 27 Jan 2013, 14:00:19 UTC - in response to Message 1331980.  

Is anyone successfully using a proxy? I was getting good results until 3 days ago. Now I cannot connect at all with a proxy and have tried a lot of different ones.

Frank


Sometimes it works with a proxy, sometimes it doesn't.
Lots of babysitting, though.



With each crime and every kindness we birth our future.
ID: 1331981 · Report as offensive
cdemers
Volunteer tester

Joined: 18 May 99
Posts: 30
Credit: 17,235,002
RAC: 0
Canada
Message 1331983 - Posted: 27 Jan 2013, 14:07:45 UTC

Things are a bit frustrating right now. I'm thinking the project needs to set up a second scheduler on another server and load balance between them. If I remember correctly, BOINC does support doing this.
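
As a rough illustration of the idea (not how SETI@home or the BOINC client actually does it), here is a sketch of client-side balancing across two schedulers: try them in random order and fall back to the other one if the first doesn't answer. The URLs and the `try_contact` callback are hypothetical placeholders.

```python
import random
from typing import Callable, Optional, Sequence

# Hypothetical scheduler URLs, for illustration only.
SCHEDULERS = (
    "http://scheduler1.example.org/cgi",
    "http://scheduler2.example.org/cgi",
)


def contact_any(
    schedulers: Sequence[str],
    try_contact: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Try the schedulers in random order; return the first URL that answers, else None."""
    rng = rng or random.Random()
    order = list(schedulers)
    rng.shuffle(order)  # random order spreads requests across the servers
    for url in order:
        if try_contact(url):
            return url
    return None


# Simulate scheduler1 being down: every request should end up on scheduler2.
def only_second_is_up(url: str) -> bool:
    return url == SCHEDULERS[1]


print(contact_any(SCHEDULERS, only_second_is_up))
```

With both servers healthy, requests split roughly evenly between them; with one down, clients still get through at the cost of one failed attempt about half the time.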

ID: 1331983 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.