Panic Mode On (114) Server Problems?

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13904
Credit: 208,696,464
RAC: 304
Australia
Message 1968779 - Posted: 5 Dec 2018, 9:26:52 UTC

And the splitters are still rather lethargic.
Grant
Darwin NT
ID: 1968779 · Report as offensive
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1968824 - Posted: 5 Dec 2018, 15:56:02 UTC
Last modified: 5 Dec 2018, 16:04:11 UTC

As Grant said, the splitters are not keeping up. RTS is falling at the moment.

Edit: to keep up with the 140k per hour being returned and needing to be replaced, we need a creation rate of around 39/sec. Anything faster would build the RTS queue back up.
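
As a rough check on that figure, here is a minimal sketch of the arithmetic in Python. The function name is made up for illustration; the only input taken from the post is the 140k-per-hour return rate.

```python
SECONDS_PER_HOUR = 3600


def required_creation_rate(returned_per_hour: float) -> float:
    """Per-second creation rate needed to replace what is returned each hour."""
    return returned_per_hour / SECONDS_PER_HOUR


rate = required_creation_rate(140_000)
print(f"{rate:.1f} per second")  # 38.9 per second
```

Rounded up, that is the "around 39/sec" quoted above. A sustained rate below it drains the RTS queue; a rate above it refills it.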
ID: 1968824 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 1968828 - Posted: 5 Dec 2018, 16:11:20 UTC - in response to Message 1968715.  


. . Where can I get some hair to tear out when I need it ..........

Stephen

:(


A wig?
A proud member of the OFA (Old Farts Association).
ID: 1968828 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1968885 - Posted: 5 Dec 2018, 22:40:27 UTC

Did anybody notice that the replica database has reappeared after half a year? Wonder what was wrong with it and what was needed to fix it?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1968885 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1968891 - Posted: 5 Dec 2018, 23:23:27 UTC - in response to Message 1968885.  

Did anybody notice that the replica database has reappeared after half a year? Wonder what was wrong with it and what was needed to fix it?
But it's currently 76,944 seconds (over 21 hours) behind the master.

Was it up-to-date last night, and not updated since - or was it even further behind, but catching up? I was in a train during the outage, and didn't notice the exact figures when I got home.
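
For anyone wanting to answer that from two readings of the server status page, here is a small sketch. It assumes the lag figure is simply "seconds behind the master", so a replica that applies nothing falls behind by one second per second. The function names and the earlier reading in the usage note are hypothetical; only the 76,944 figure comes from this post.

```python
from datetime import timedelta


def describe_lag(lag_seconds: int) -> str:
    """Format a lag given in seconds as H:MM:SS."""
    return str(timedelta(seconds=lag_seconds))


def lag_trend(earlier_lag: float, later_lag: float, elapsed: float) -> str:
    """Classify the replica from two lag readings taken `elapsed` seconds apart."""
    change = later_lag - earlier_lag
    if change < 0:
        return "catching up"      # lag is shrinking
    if change == 0:
        return "keeping pace"     # applying updates as fast as the master makes them
    if change >= elapsed:
        return "stalled"          # lag grows as fast as the clock: nothing applied
    return "falling behind"       # applying updates, but slower than the master


print(describe_lag(76_944))  # 21:22:24
```

With a hypothetical reading of 80,000 seconds taken an hour earlier, `lag_trend(80_000, 76_944, 3_600)` returns `"catching up"`; a reading of 73,344 an hour earlier would return `"stalled"`.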
ID: 1968891 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1968896 - Posted: 5 Dec 2018, 23:51:15 UTC - in response to Message 1968891.  

Did anybody notice that the replica database has reappeared after half a year? Wonder what was wrong with it and what was needed to fix it?
But it's currently 76,944 seconds (over 21 hours) behind the master.

Was it up-to-date last night, and not updated since - or was it even further behind, but catching up? I was in a train during the outage, and didn't notice the exact figures when I got home.

Haveland graphs only have it restarting at 1800 UTC Wednesday. So it didn't reappear after yesterday's outage but in fact just today.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1968896 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1968903 - Posted: 6 Dec 2018, 0:25:35 UTC - in response to Message 1968885.  
Last modified: 6 Dec 2018, 0:27:31 UTC

Did anybody notice that the replica database has reappeared after half a year? Wonder what was wrong with it and what was needed to fix it?


. . The main thing would be that it is actually fixed ...

. . Fingers crossed.

. . I am still on the lookout for 'amigos' to appear :)

Stephen

:)
ID: 1968903 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13904
Credit: 208,696,464
RAC: 304
Australia
Message 1968945 - Posted: 6 Dec 2018, 8:16:07 UTC
Last modified: 6 Dec 2018, 8:19:13 UTC

Splitters and MB Validators are both still rather lethargic, although the Splitters are doing better than the Assimilators. The Assimilator backlog isn't clearing, but at least it isn't growing either.
Grant
Darwin NT
ID: 1968945 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1969154 - Posted: 7 Dec 2018, 21:33:02 UTC - in response to Message 1968945.  

Uh oh
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1969154 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1969156 - Posted: 7 Dec 2018, 21:36:51 UTC - in response to Message 1969154.  
Last modified: 7 Dec 2018, 21:47:12 UTC

Da, contemplating an email to Eric or not ...
EDIT: SSP Update doesn't show much improvement.
Mail Sent to the Master.
ID: 1969156 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1969158 - Posted: 7 Dec 2018, 21:47:58 UTC - in response to Message 1969156.  
Last modified: 7 Dec 2018, 22:03:47 UTC

Da, contemplating an email to Eric or not ...


. . NO splitters are running. The only new work is resends. I would guess either someone is doing something, or they need to know there is a problem. Since it isn't quite 2pm in Berkeley on a Friday, now might be a good time to find out which ...

Stephen

? ?
ID: 1969158 · Report as offensive
JohnDK Crowdfunding Project Donor* · Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1969160 - Posted: 7 Dec 2018, 22:04:14 UTC

It's not only resends, I'm also getting new work, just not so often...
ID: 1969160 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1969162 - Posted: 7 Dec 2018, 22:08:00 UTC - in response to Message 1969160.  

It's not only resends, I'm also getting new work, just not so often...


. . According to the stats, the hopper (RTS) was pretty full about 10 to 15 mins ago so some of those tasks must be filtering out to the field.

Stephen

. .
ID: 1969162 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13904
Credit: 208,696,464
RAC: 304
Australia
Message 1969169 - Posted: 7 Dec 2018, 22:31:55 UTC

Ever since the last weekly outage, the splitters & MB Validators have been rather marginal in their performance. And looking at the Haveland Graphs you can see there have been 2 other instances where the In Progress level has dropped, an indication of work not being sent out. More Scheduler shenanigans.

And the reason the splitters aren't running at the moment is that the Ready-to-send buffer is full, and it's not going to get any lower if the Schedulers aren't going to dish out any work: "Project has no tasks available" has been the response for over an hour now.
Grant
Darwin NT
ID: 1969169 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1969172 - Posted: 7 Dec 2018, 22:36:30 UTC

Yes, I have run out of GPU work already. Nothing but "Project has no tasks available" messages.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1969172 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1969173 - Posted: 7 Dec 2018, 22:37:52 UTC - in response to Message 1969169.  

That is not surprising at all. With the replica database being brought back online, it is trying to catch up and creating a lot of extra load.
ID: 1969173 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1646
Credit: 12,921,799
RAC: 89
New Zealand
Message 1969176 - Posted: 7 Dec 2018, 22:49:18 UTC

Things are working as expected, in my opinion. It took about 3 attempts, but I now have a full load of work on board: 100 tasks. One of them is a _2 task, which I have pushed to the front of the queue.
ID: 1969176 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1969185 - Posted: 7 Dec 2018, 23:18:38 UTC

07/12/2018 23:05:39 | SETI@home | Scheduler request completed: got 0 new tasks
07/12/2018 23:15:06 | SETI@home | Scheduler request completed: got 24 new tasks

Don't pay too much attention to transient, ephemeral, artefacts. Give the servers a second chance.
ID: 1969185 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13904
Credit: 208,696,464
RAC: 304
Australia
Message 1969186 - Posted: 7 Dec 2018, 23:24:35 UTC - in response to Message 1969185.  

07/12/2018 23:05:39 | SETI@home | Scheduler request completed: got 0 new tasks
07/12/2018 23:15:06 | SETI@home | Scheduler request completed: got 24 new tasks

Don't pay too much attention to transient, ephemeral, artefacts. Give the servers a second chance.

8/12/2018 07:20:45 | SETI@home | Project has no tasks available
8/12/2018 07:25:53 | SETI@home | Project has no tasks available
8/12/2018 07:31:02 | SETI@home | Project has no tasks available
8/12/2018 07:36:11 | SETI@home | Project has no tasks available
8/12/2018 07:41:20 | SETI@home | Project has no tasks available
8/12/2018 07:46:35 | SETI@home | Project has no tasks available
8/12/2018 07:51:45 | SETI@home | Project has no tasks available
8/12/2018 07:56:54 | SETI@home | Project has no tasks available
8/12/2018 08:02:04 | SETI@home | Project has no tasks available
8/12/2018 08:07:14 | SETI@home | Project has no tasks available
8/12/2018 08:12:23 | SETI@home | Project has no tasks available
8/12/2018 08:17:32 | SETI@home | Project has no tasks available
8/12/2018 08:22:42 | SETI@home | Project has no tasks available
8/12/2018 08:27:52 | SETI@home | Project has no tasks available
8/12/2018 08:33:02 | SETI@home | Project has no tasks available
8/12/2018 08:38:12 | SETI@home | Scheduler request completed: got 1 new tasks
8/12/2018 08:43:21 | SETI@home | Project has no tasks available
8/12/2018 08:48:31 | SETI@home | Project has no tasks available

In progress has dropped by around 250k over the last couple of hours.
Not what I would consider to be transient, ephemeral or an artefact. I'd say there is an issue when it comes to allocating work.
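
To put a number on it, here is a minimal Python sketch that tallies the event log lines pasted above. It assumes the `timestamp | project | message` layout shown in those lines, with day/month/year dates; the function and variable names are made up for illustration.

```python
import re
from datetime import datetime

# Log lines copied verbatim from the post above.
LOG = """\
8/12/2018 07:20:45 | SETI@home | Project has no tasks available
8/12/2018 07:25:53 | SETI@home | Project has no tasks available
8/12/2018 07:31:02 | SETI@home | Project has no tasks available
8/12/2018 07:36:11 | SETI@home | Project has no tasks available
8/12/2018 07:41:20 | SETI@home | Project has no tasks available
8/12/2018 07:46:35 | SETI@home | Project has no tasks available
8/12/2018 07:51:45 | SETI@home | Project has no tasks available
8/12/2018 07:56:54 | SETI@home | Project has no tasks available
8/12/2018 08:02:04 | SETI@home | Project has no tasks available
8/12/2018 08:07:14 | SETI@home | Project has no tasks available
8/12/2018 08:12:23 | SETI@home | Project has no tasks available
8/12/2018 08:17:32 | SETI@home | Project has no tasks available
8/12/2018 08:22:42 | SETI@home | Project has no tasks available
8/12/2018 08:27:52 | SETI@home | Project has no tasks available
8/12/2018 08:33:02 | SETI@home | Project has no tasks available
8/12/2018 08:38:12 | SETI@home | Scheduler request completed: got 1 new tasks
8/12/2018 08:43:21 | SETI@home | Project has no tasks available
8/12/2018 08:48:31 | SETI@home | Project has no tasks available
"""

GOT_TASKS = re.compile(r"got (\d+) new tasks")


def summarize(log_text: str) -> dict:
    """Count empty replies and tasks received, and measure the time span covered."""
    stamps, empty, received = [], 0, 0
    for line in log_text.strip().splitlines():
        stamp, _project, message = (part.strip() for part in line.split("|", 2))
        stamps.append(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"))
        match = GOT_TASKS.search(message)
        if match:
            count = int(match.group(1))
            received += count
            if count == 0:
                empty += 1  # "got 0 new tasks" is also an empty reply
        elif "no tasks available" in message:
            empty += 1
    return {
        "requests": len(stamps),
        "empty": empty,
        "tasks_received": received,
        "span": max(stamps) - min(stamps),
    }


summary = summarize(LOG)
print(summary["requests"], summary["empty"], summary["tasks_received"], summary["span"])
# 18 17 1 1:27:46
```

That is 17 empty replies out of 18 requests and a single task received in just under an hour and a half.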
Grant
Darwin NT
ID: 1969186 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1969188 - Posted: 7 Dec 2018, 23:43:43 UTC

Check how many tasks the top machines have: https://setiathome.berkeley.edu/top_hosts.php

One of mine has around 400 out of 1100.
Another has 60 out of 500 and will be out in about 10 to 15 minutes.
ID: 1969188 · Report as offensive
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.