Panic Mode On (105) Server Problems?

Message boards : Number crunching : Panic Mode On (105) Server Problems?

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1851674 - Posted: 26 Feb 2017, 21:30:30 UTC

Posting and accessing the web pages has gotten remarkably slower in the last 48 hours. Everything seems to be getting bogged down somehow.

The end is near.......................
The only end that is near is the end of this doggone long thread.

Meow.


Okay, I will create a new one, I have been busy with College lately.

ID: 1851674
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1851714 - Posted: 26 Feb 2017, 23:24:29 UTC - in response to Message 1851674.  

Posting and accessing the web pages has gotten remarkably slower in the last 48 hours. Everything seems to be getting bogged down somehow.

The end is near.......................
The only end that is near is the end of this doggone long thread.

Meow.


Okay, I will create a new one, I have been busy with College lately.


. . I think Kittyman was referring to the ongoing discussion about the problems getting new work from the schedulers.

. . He seems to be doing quite OK but there are others who are definitely not.

Stephen

:)
ID: 1851714
Wiggo
Joined: 24 Jan 00
Posts: 36363
Credit: 261,360,520
RAC: 489
Australia
Message 1851718 - Posted: 26 Feb 2017, 23:31:54 UTC

Still no problems here keeping caches full while work is available.

But I do wonder what this week's outrage has in store for us.

Cheers.
ID: 1851718
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1851752 - Posted: 27 Feb 2017, 1:58:33 UTC - in response to Message 1851674.  

Posting and accessing the web pages has gotten remarkably slower in the last 48 hours. Everything seems to be getting bogged down somehow.

The end is near.......................
The only end that is near is the end of this doggone long thread.

Meow.


Okay, I will create a new one, I have been busy with College lately.

Thanks Arkayn, I was thinking myself that this thread was getting long in the tooth and wondering when you might start a new one. Thanks. Should speed up searches in the thread.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1851752
Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1851763 - Posted: 27 Feb 2017, 3:04:26 UTC

"Scotty, get the Drano again. The bloody download pipe is clogged again!"
ID: 1851763
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1851766 - Posted: 27 Feb 2017, 3:25:04 UTC - in response to Message 1851763.  

Uggh! You are correct. Site not responding to the flip-flip in Preferences to get the servers to send out work to me.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1851766
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1851791 - Posted: 27 Feb 2017, 5:02:51 UTC - in response to Message 1851766.  
Last modified: 27 Feb 2017, 5:03:08 UTC

Uggh! You are correct. Site not responding to the flip-flip in Preferences to get the servers to send out work to me.

Giving that a go now.
Got home from work to find my main system almost completely out of GPU work (about 50 WUs left between 2 GPUs).

I also notice that the ready-to-send buffer has reached new highs. Is there some other problem there as well?
Grant
Darwin NT
ID: 1851791
Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1851794 - Posted: 27 Feb 2017, 5:11:34 UTC - in response to Message 1851791.  

I think the RTS overrun is resends being created but not sent out, so they are just piling up.
ID: 1851794
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1851796 - Posted: 27 Feb 2017, 5:25:27 UTC - in response to Message 1851794.  

A quarter of a million tasks ready-to-send and I just got 7. Another couple of hundred (not allowing for any more returned work) and I'll have a full cache in time for the weekly outage.
I can see long periods of no Seti work ahead...
Grant
Darwin NT
ID: 1851796
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1851802 - Posted: 27 Feb 2017, 5:45:20 UTC - in response to Message 1851791.  

Uggh! You are correct. Site not responding to the flip-flip in Preferences to get the servers to send out work to me.

Giving that a go now.
Got home from work to find my main system almost completely out of GPU work (about 50 WUs left between 2 GPUs).

OK, things are getting silly.
System almost out of work, changed Application settings around, got 7 WUs on 1 request out of 5, none on any of the other requests. Changed them back to what they were (when the system was almost out of work), and after Updating to get the changes, got 52 WUs on the very next request. Followed by 0 on the request after that.
>:-/
Grant
Darwin NT
ID: 1851802
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1851805 - Posted: 27 Feb 2017, 5:47:13 UTC - in response to Message 1851802.  

And the forums, the Tasks pages and the Web settings are responding very erratically. Sometimes a page loads as soon as you click on the link, other times you've got to wait up to 30 seconds.
Grant
Darwin NT
ID: 1851805
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22447
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1851806 - Posted: 27 Feb 2017, 6:15:25 UTC

RTS is over 600k, and I'm still struggling to get tasks, even CPU tasks. Something has upset the schedulers again.
Also, the board is very patchy: one moment it works very well, the next it times out.

I guess we will have to wait for the tyre kicker to return from his slumbers.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1851806
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1851807 - Posted: 27 Feb 2017, 6:23:09 UTC - in response to Message 1851806.  

Along with 2 minute+ load times for the threads, I'm now getting "Couldn't connect to server" and "HTTP service unavailable" errors to go with the "Project has no tasks available" when I can get a response.
In the time it took to finally make this post, the errors have become the standard response to a Scheduler request.
I think something's gotten stuck and/or out of control & it's bringing everything else down with it.

Haven't we had something like this before, when the splitters got stuck "On" & the server ran out of free disk space when it got chock-a-block full of WUs ready-to-send?
Grant
Darwin NT
ID: 1851807
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1851812 - Posted: 27 Feb 2017, 6:44:53 UTC - in response to Message 1851807.  

And now the forums are working again and the Scheduler is responding (although still with "Project has no tasks available").

The Haveland graphs are interesting. In particular, Work in progress over the last week shows a gradual decline starting a couple of days ago, then a rapid drop, a recovery several hours later that lasted an hour or so, then another gradual decline and another rapid drop today. There was a slight recovery, then a period of no data, and it looks like it's back to a slow decline.

There's work available, people are requesting it, but the Scheduler just won't allocate it. Whatever went wrong at the end of December is getting worse.
Grant
Darwin NT
ID: 1851812
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1851814 - Posted: 27 Feb 2017, 6:53:04 UTC - in response to Message 1851802.  

Uggh! You are correct. Site not responding to the flip-flip in Preferences to get the servers to send out work to me.

Giving that a go now.
Got home from work to find my main system almost completely out of GPU work (about 50 WUs left between 2 GPUs).

OK, things are getting silly.
System almost out of work, changed Application settings around, got 7 WUs on 1 request out of 5, none on any of the other requests. Changed them back to what they were (when the system was almost out of work), and after Updating to get the changes, got 52 WUs on the very next request. Followed by 0 on the request after that.
>:-/

Just changed my application settings again, and picked up 51 WUs on the next request.
Grant
Darwin NT
ID: 1851814
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1851815 - Posted: 27 Feb 2017, 7:00:56 UTC - in response to Message 1851718.  
Last modified: 27 Feb 2017, 7:02:06 UTC

Still no problems here keeping caches full while work is available.

The really weird thing is that I have the same version of the Manager on both of my systems. Generally it is only the faster system (i7, 2 × GTX 1070) that has issues getting work. There have been times the C2D (2 × GTX 750 Ti) has struggled, but only very occasionally. Even so, when it has struggled there have been times when the settings that would get it work would stop the i7 from getting work. Change them for the i7, and the C2D stops getting work.
Most of the time though, the C2D has no issue getting work, regardless of the application settings. And the i7 has gone from having to change them every 8-12 hours or so, to every 2-4 days, to as often as 3-4 times a day, to now, where changing the settings has been necessary within a matter of hours in order to continue to receive work.

All of this has occurred since the late Dec issue, when people who have AP selected as their preferred application, with "Accept other work" selected to pick up v8 work when there's no AP about, suddenly found they were no longer getting any v8 work and had to specifically select the Seti@home v8 application option.
Grant
Darwin NT
ID: 1851815
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1851838 - Posted: 27 Feb 2017, 13:32:00 UTC - in response to Message 1851815.  

I see there's quite a bit of AP work going out at the moment, and the last of the current GBT files is just about done. Final channel in progress.

Interestingly the last 2 recoveries from sudden drops of v8 work-in-progress occurred at the same time as AP work being split (25th and 27th Feb, same time each day).
Causation, or just correlation?
Grant
Darwin NT
ID: 1851838
Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1851843 - Posted: 27 Feb 2017, 13:53:48 UTC - in response to Message 1851838.  

Of course the s@h in progress is declining: the system is still purging out those bad ATI 8.23 workunits that were sent out. You can see it in your resent tasks.

A drop of 5-600k more and we should be back to normal.
ID: 1851843
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22447
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1851852 - Posted: 27 Feb 2017, 14:44:57 UTC

Observation/thought: It's almost as if the servers aren't capable of recognising that the fastest crunchers are now returning several results every five-minute cycle. So they don't send tasks on every request, then have to send out a big pile, and the send queue gets depleted in one hit instead of serving a few folks.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1851852
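A toy sketch of rob smith's hypothesis above, for anyone who wants to play with it. It assumes a fixed-size send buffer that is topped up by a small amount before each scheduler request. The function name `simulate`, the buffer size of 100 and the refill rate of 5 are illustrative assumptions only, not figures taken from the SETI@home servers or the BOINC code.

```python
def simulate(slots, refill_per_tick, requests):
    """Toy model of a fixed-size send buffer (hypothetical, not BOINC code).

    slots: capacity of the buffer (assumed figure)
    refill_per_tick: tasks added to the buffer before each request
    requests: number of tasks each host asks for, one request per tick
    Returns the number of tasks granted to each request, in order.
    """
    available = slots  # start with a full buffer
    granted = []
    for wanted in requests:
        # Top the buffer up, but never beyond its capacity.
        available = min(slots, available + refill_per_tick)
        # Hand out as many tasks as asked for, limited by what is there.
        given = min(wanted, available)
        available -= given
        granted.append(given)
    return granted


if __name__ == "__main__":
    # One fast host asks for a big pile, three others follow.
    print(simulate(100, 5, [100, 20, 20, 20]))
    # Four hosts asking for modest amounts.
    print(simulate(100, 5, [20, 20, 20, 20]))
```

With these made-up numbers, the single request for 100 tasks empties the buffer and each of the three hosts that follow gets only the 5 tasks refilled in the meantime, whereas four requests of 20 are all served in full. Whether the real schedulers behave anything like this is exactly the open question.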
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1851853 - Posted: 27 Feb 2017, 14:50:41 UTC

The kitties must be doing something right.
Because through this all, they have always maintained nearly a full cache.
I am at 2500 out of 2500 presently.
I have no clue why or why not some are having difficulty.

And for once, the kittyman has no advice he can offer.
I usually can sort some things and render some things that could help.
Not presently.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1851853
 