Panic Mode On (109) Server Problems?

rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1906956 - Posted: 14 Dec 2017, 6:18:01 UTC

...and the SSP is now hiding in a corner with the lights turned off :-(
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1906956 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1906958 - Posted: 14 Dec 2017, 6:29:29 UTC
Last modified: 14 Dec 2017, 6:41:42 UTC

Looks like it took about 7 hours, but the splitters have finally come good. Managing to pump out plenty of work as required. The assimilation backlog is clearing.
Unfortunately the WU awaiting deletion backlog continues to grow (it was falling while the assimilator backlog was growing), and the replica continues to fall further & further behind.


EDIT- and the forums are very random as to whether they will load, or take a minute or 2 to do it... (same for looking at accounts and tasks/systems).
Grant
Darwin NT
ID: 1906958 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1906968 - Posted: 14 Dec 2017, 7:06:49 UTC
Last modified: 14 Dec 2017, 7:18:55 UTC

Anyone had successful contact with the Scheduler in the last 25min?
14/12/2017 16:19:35 | SETI@home | Scheduler request failed: Couldn't connect to server
14/12/2017 16:27:54 | SETI@home | Scheduler request failed: Couldn't connect to server
14/12/2017 16:29:53 | SETI@home | Scheduler request failed: Couldn't connect to server
14/12/2017 16:35:17 | SETI@home | Scheduler request failed: HTTP internal server error
14/12/2017 16:45:49 | SETI@home | Scheduler request failed: HTTP service unavailable
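For anyone who wants to tally these from a saved copy of the Event Log, here is a minimal Python sketch. It assumes the `timestamp | project | message` layout seen in the lines quoted in this thread; the function name `summarize` and the sample list are made up for illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Assumed layout: "<timestamp> | <project> | <message>", as in the log lines quoted above.
LINE_RE = re.compile(r"^(?P<stamp>[^|]+?) \| (?P<project>[^|]+?) \| (?P<message>.*)$")
TASKS_RE = re.compile(r"Scheduler request completed: got (\d+) new tasks")
FAILED_PREFIX = "Scheduler request failed: "


def summarize(lines):
    """Count scheduler failures by reason and total the new tasks received."""
    failures = Counter()
    tasks = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a three-field log line
        message = match.group("message")
        if message.startswith(FAILED_PREFIX):
            failures[message[len(FAILED_PREFIX):]] += 1
        else:
            got = TASKS_RE.search(message)
            if got:
                tasks += int(got.group(1))
    return failures, tasks


# Hypothetical sample: the five lines from the post above.
sample = [
    "14/12/2017 16:19:35 | SETI@home | Scheduler request failed: Couldn't connect to server",
    "14/12/2017 16:27:54 | SETI@home | Scheduler request failed: Couldn't connect to server",
    "14/12/2017 16:29:53 | SETI@home | Scheduler request failed: Couldn't connect to server",
    "14/12/2017 16:35:17 | SETI@home | Scheduler request failed: HTTP internal server error",
    "14/12/2017 16:45:49 | SETI@home | Scheduler request failed: HTTP service unavailable",
]

failures, tasks = summarize(sample)
for reason, count in failures.most_common():
    print(f"{count} x {reason}")
print(f"new tasks received: {tasks}")
```

Feeding it the whole log instead of `sample` should give a rough picture of how often the scheduler was unreachable versus handing out work.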
Grant
Darwin NT
ID: 1906968 · Report as offensive
David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1906972 - Posted: 14 Dec 2017, 7:21:31 UTC - in response to Message 1906968.  

Anyone had successful contact with the Scheduler in the last 25min?


No, same here. Communication deferred for 20 minutes. Hopefully not a sign of database issues.

Forum also very slow to respond.
ID: 1906972 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1906978 - Posted: 14 Dec 2017, 7:44:43 UTC - in response to Message 1906968.  

Anyone had successful contact with the Scheduler in the last 25min?
Yup.

14/12/2017 07:40:35 | SETI@home | Sending scheduler request: To fetch work.
14/12/2017 07:40:35 | SETI@home | Reporting 1 completed tasks
14/12/2017 07:40:35 | SETI@home | Requesting new tasks for NVIDIA GPU
14/12/2017 07:40:39 | SETI@home | Scheduler request completed: got 77 new tasks
One of my other machines had been failing for a little while, but got through just as I was starting my morning checks.
ID: 1906978 · Report as offensive
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1906980 - Posted: 14 Dec 2017, 7:48:38 UTC

Just looked at my machines. They had a few instances of no communication a half hour ago. All machines were able to get work in the last few minutes and are fully cached. Don't like the fact the SSP has gone missing again. I had hoped they would fix the ever growing task load on the deleters during the extended outage. Things are NOT running normally.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1906980 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1906981 - Posted: 14 Dec 2017, 7:50:27 UTC - in response to Message 1906978.  

Anyone had successful contact with the Scheduler in the last 25min?
Yup.

14/12/2017 07:40:35 | SETI@home | Sending scheduler request: To fetch work.
14/12/2017 07:40:35 | SETI@home | Reporting 1 completed tasks
14/12/2017 07:40:35 | SETI@home | Requesting new tasks for NVIDIA GPU
14/12/2017 07:40:39 | SETI@home | Scheduler request completed: got 77 new tasks
One of my other machines had been failing for a little while, but got through just as I was starting my morning checks.

Next request, it managed to find signs of life.
14/12/2017 17:01:19 | SETI@home | Scheduler request completed: got 26 new tasks.

Hopefully it'll hang in there till morning when Eric can have another look at what's going on.
Grant
Darwin NT
ID: 1906981 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1906988 - Posted: 14 Dec 2017, 8:14:04 UTC

Added to the SSP being broken, the forum is slow, and a few random profile pages are "broken".
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1906988 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1906992 - Posted: 14 Dec 2017, 8:33:53 UTC - in response to Message 1906988.  

And the fundraiser icons are still coming and going like yo-yos.
ID: 1906992 · Report as offensive
Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1907019 - Posted: 14 Dec 2017, 11:31:25 UTC

The haveland graphs (and my RAC) look more like a blueprint design for a new LEGO build than a stable system.
ID: 1907019 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1907025 - Posted: 14 Dec 2017, 13:24:37 UTC
Last modified: 14 Dec 2017, 13:53:35 UTC

Looks to be a number of problems, from pages just being slow to pages not displaying.
Missing:
https://setiathome.berkeley.edu/show_server_status.php
https://setiathome.berkeley.edu/top_hosts.php?sort_by=expavg_credit&offset=20
ID: 1907025 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1907030 - Posted: 14 Dec 2017, 14:46:27 UTC

It's somewhat random which pages vanish, as some of them re-appear. However, it does appear to be getting more widespread and persistent. I think a common thread is that they are all data-driven pages, so that might be a clue (or it might be a red herring).
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1907030 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1907034 - Posted: 14 Dec 2017, 15:06:48 UTC - in response to Message 1907030.  

But on the other hand, I had problems a few minutes ago refreshing these boards, with the browser pausing at 'establishing secure connection'. Surely that would happen a long way down into the web server hierarchy, long before it got to binding data into a driven web page?
ID: 1907034 · Report as offensive
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907053 - Posted: 14 Dec 2017, 21:00:25 UTC

Wish we had some news about the difficulties the project is having. The SSP numbers are still whack, especially the db purging. I haven't been able to get any work on the Linux cruncher since the project came back this morning. I can't view any host because the request times out. I wish they would tell us what the issue is and what they need to do and then just take the project down for however long it takes to fix things correctly.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1907053 · Report as offensive
Piotr

Joined: 24 May 17
Posts: 18
Credit: 20,069,282
RAC: 41
Poland
Message 1907054 - Posted: 14 Dec 2017, 21:03:08 UTC

Again I can't get WUs, and my slower cruncher got more work in advance than my faster one.
I also noted that the climb rate of the total score graph has dropped by a factor of roughly two on both machines.
The RAC graph looks like a seismograph trace; it looks like S@H is giving fewer points for the work done than before the outage.

14.12.2017 22:00:50 | SETI@home | Project has no tasks available ... why ?
ID: 1907054 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1907078 - Posted: 14 Dec 2017, 22:45:25 UTC
Last modified: 14 Dec 2017, 22:45:46 UTC

Fixed a little earlier than Christmas... LOL

Thu 14 Dec 2017 05:40:13 PM EST | SETI@home | Scheduler request completed: got 106 new tasks
ID: 1907078 · Report as offensive
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907083 - Posted: 14 Dec 2017, 22:54:23 UTC

I'm still getting nothing but "no tasks are available" across all machines. I've tried all the tricks: TBar's triple update and the server tire kick. Nothing coming in. I'm down to 50 CPU and 65 GPU tasks on the Linux cruncher. Won't be long before I'm out of GPU work and I guess my new Einstein backup project gets the love.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1907083 · Report as offensive
Kissagogo27 Special Project $75 donor
Joined: 6 Nov 99
Posts: 715
Credit: 8,032,827
RAC: 62
France
Message 1907087 - Posted: 14 Dec 2017, 23:02:04 UTC

I have different values of RAC between the graphic area (6257) and the project tab (5611) of BOINC Manager...
ID: 1907087 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1907090 - Posted: 14 Dec 2017, 23:10:55 UTC - in response to Message 1907083.  

I'm still getting nothing but "no tasks are available" across all machines. I've tried all the tricks: TBar's triple update and the server tire kick. Nothing coming in. I'm down to 50 CPU and 65 GPU tasks on the Linux cruncher. Won't be long before I'm out of GPU work and I guess my new Einstein backup project gets the love.


. . Well after the unscheduled 3 hour outage a few hours ago I am now getting mostly "no tasks available". Won't be long until I am out of work again. :(

. . Maybe I should give the guys a rest anyway with the heat we've been having lately. A/C isn't coping and with the wick turned way up it is still 30 C in here :(

. . Oh well ... I'll give it another hour ...

Stephen

:)
ID: 1907090 · Report as offensive
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907092 - Posted: 14 Dec 2017, 23:18:41 UTC

Well, the Linux cruncher got a pardon. I was down to a dozen GPU tasks. What was really hurting me was that the scheduler was constantly putting my GPUs into 10 and 20 minute backoffs.

So a question for the group: what criteria does the scheduler use to put a host into backoff, for both CPU and GPU conditions?

And what is the correct way to utilize the many Event Log options to clear said backoffs? I just keep fumbling around with the various options and looking at the project Properties until I see the backoff get removed. I don't know which option, if any, of the ones I have been toggling is the one that clears the backoff. It would be nice to know which option is guaranteed to get rid of the backoff.

Anyone know?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1907092 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.