Panic Mode On (114) Server Problems?

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1968756 - Posted: 5 Dec 2018, 3:50:50 UTC - in response to Message 1968753.  

There is a scheduled outage on Tuesdays. The start time is variable and the end time is variable. When the outage is done, the SETI servers get clogged with us all trying to return finished WUs and get new WUs at once. There are never enough WUs in the RTS queue when it does start handing out WUs again, so there are a variety of communication errors that are "normal" for Tuesdays.

I'm familiar with that, but the second block of messages is from yesterday (Monday) on the other machine that gets work when it needs it. That's why I'm suggesting that is "normal" now for some reason.

edit: I'm concerned by the line mentioning you don't have a usable version of V8

On my main machine, I am set for AP-only, and my app_info only has AP in it. I made a work request and the scheduler said that no AP was available. It wanted to send me MBs instead, but my app_info doesn't have entries for MB, so it can't send me that work.

It does that even though the venue settings are set to NOT send work for other applications. It has done that for something like 3 years now.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1968756
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1968769 - Posted: 5 Dec 2018, 7:01:36 UTC

Looks like the MB Validators are falling behind, again.
Grant
Darwin NT
ID: 1968769
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1968779 - Posted: 5 Dec 2018, 9:26:52 UTC

And the splitters are still rather lethargic.
Grant
Darwin NT
ID: 1968779
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1968824 - Posted: 5 Dec 2018, 15:56:02 UTC
Last modified: 5 Dec 2018, 16:04:11 UTC

As Grant said, the splitters are not keeping up. RTS is falling at the moment.

Edit: to keep up with the 140k per hour being returned and needing to be replaced, we need a creation rate of around 39/sec. Anything faster would build the RTS queue back up.
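
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function names are made up for illustration, and it assumes each returned task has to be replaced by exactly one newly split task, which simplifies how the real servers behave.

```python
SECONDS_PER_HOUR = 3600


def required_creation_rate(returned_per_hour: float) -> float:
    """Creation rate (tasks per second) needed to replace what is returned each hour."""
    return returned_per_hour / SECONDS_PER_HOUR


def rts_change_per_hour(creation_per_sec: float, returned_per_hour: float) -> float:
    """Net hourly change in the ready-to-send queue.

    Assumption: every returned task is replaced by one task taken from the queue.
    A negative result means the queue is draining.
    """
    return creation_per_sec * SECONDS_PER_HOUR - returned_per_hour


if __name__ == "__main__":
    rate = required_creation_rate(140_000)
    print(f"break-even creation rate: {rate:.1f}/sec")
    print(f"RTS change at 30/sec: {rts_change_per_hour(30, 140_000):+.0f} per hour")
```

At the break-even rate the queue only holds steady; it takes a rate above that to refill it.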
ID: 1968824
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 1968828 - Posted: 5 Dec 2018, 16:11:20 UTC - in response to Message 1968715.  


. . Where can I get some hair to tear out when I need it ..........

Stephen

:(


A wig?
A proud member of the OFA (Old Farts Association).
ID: 1968828
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1968885 - Posted: 5 Dec 2018, 22:40:27 UTC

Did anybody notice that the replica database has reappeared after half a year? Wonder what was wrong with it and what was needed to fix it?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1968885
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1968891 - Posted: 5 Dec 2018, 23:23:27 UTC - in response to Message 1968885.  

Did anybody notice that the replica database has reappeared after half a year? Wonder what was wrong with it and what was needed to fix it?
But it's currently 76,944 seconds (over 21 hours) behind the master.

Was it up-to-date last night, and not updated since - or was it even further behind, but catching up? I was in a train during the outage, and didn't notice the exact figures when I got home.
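
As a quick check of the "over 21 hours" figure, here is a small Python sketch that converts a lag in seconds into hours, minutes and seconds. The helper name is hypothetical and the lag value is simply the one quoted above.

```python
def lag_to_hms(lag_seconds: int) -> tuple[int, int, int]:
    """Split a lag given in seconds into (hours, minutes, seconds)."""
    hours, remainder = divmod(lag_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


if __name__ == "__main__":
    h, m, s = lag_to_hms(76_944)
    print(f"replica lag: {h}h {m}m {s}s")  # prints "replica lag: 21h 22m 24s"
```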
ID: 1968891
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1968896 - Posted: 5 Dec 2018, 23:51:15 UTC - in response to Message 1968891.  

Did anybody notice that the replica database has reappeared after half a year? Wonder what was wrong with it and what was needed to fix it?
But it's currently 76,944 seconds (over 21 hours) behind the master.

Was it up-to-date last night, and not updated since - or was it even further behind, but catching up? I was in a train during the outage, and didn't notice the exact figures when I got home.

The Haveland graphs only show it restarting at 1800 UTC Wednesday. So it didn't reappear after yesterday's outage, but in fact just today.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1968896
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1968903 - Posted: 6 Dec 2018, 0:25:35 UTC - in response to Message 1968885.  
Last modified: 6 Dec 2018, 0:27:31 UTC

Did anybody notice that the replica database has reappeared after half a year? Wonder what was wrong with it and what was needed to fix it?


. . The main thing would be that it is actually fixed ...

. . Fingers crossed.

. . I am still on the lookout for 'amigos' to appear :)

Stephen

:)
ID: 1968903
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1968945 - Posted: 6 Dec 2018, 8:16:07 UTC
Last modified: 6 Dec 2018, 8:19:13 UTC

Splitters and MB Validators are both still rather lethargic; although the Splitters are doing better than the Assimilators. The Assimilator backlog isn't clearing, but at least it isn't growing either.
Grant
Darwin NT
ID: 1968945
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1969154 - Posted: 7 Dec 2018, 21:33:02 UTC - in response to Message 1968945.  

Uh oh
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1969154
Profile Brent Norman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1969156 - Posted: 7 Dec 2018, 21:36:51 UTC - in response to Message 1969154.  
Last modified: 7 Dec 2018, 21:47:12 UTC

Da, contemplating an email to Eric or not ...
EDIT: SSP Update doesn't show much improvement.
Mail Sent to the Master.
ID: 1969156
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1969158 - Posted: 7 Dec 2018, 21:47:58 UTC - in response to Message 1969156.  
Last modified: 7 Dec 2018, 22:03:47 UTC

Da, contemplating an email to Eric or not ...


. . NO splitters are running. The only new work is resends. I would guess someone is doing something, or they need to know there is a problem. Since it isn't quite 2pm in Berkeley on a Friday, now might be a good time to find out which ...

Stephen

? ?
ID: 1969158
JohnDK · Crowdfunding Project Donor · Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1969160 - Posted: 7 Dec 2018, 22:04:14 UTC

It's not only resends, I'm also getting new work, just not so often...
ID: 1969160
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1969162 - Posted: 7 Dec 2018, 22:08:00 UTC - in response to Message 1969160.  

It's not only resends, I'm also getting new work, just not so often...


. . According to the stats, the hopper (RTS) was pretty full about 10 to 15 mins ago so some of those tasks must be filtering out to the field.

Stephen

. .
ID: 1969162
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1969169 - Posted: 7 Dec 2018, 22:31:55 UTC

Ever since the last weekly outage, the splitters & MB Validators have been rather marginal in their performance. And looking at the Haveland Graphs, you can see there have been 2 other instances where the In Progress level has dropped, an indication of work not being sent out. More Scheduler shenanigans.

And the reason the splitters aren't running at the moment is that the Ready-to-send buffer is full, and it's not going to get any lower if the Schedulers aren't going to dish out any work. "Project has no tasks available" has been the response for over an hour now.
Grant
Darwin NT
ID: 1969169
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1969172 - Posted: 7 Dec 2018, 22:36:30 UTC

Yes, I have run out of GPU work already. Nothing but "Project has no tasks available" messages.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1969172
Profile Brent Norman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1969173 - Posted: 7 Dec 2018, 22:37:52 UTC - in response to Message 1969169.  

That is not surprising at all. With the replica database being brought back online, it is trying to catch up and creating a lot of extra load.
ID: 1969173
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1969176 - Posted: 7 Dec 2018, 22:49:18 UTC

Things are working as expected, in my opinion. It took about 3 attempts, but I now have a full load of work on board: 100 tasks. One is a _2 task, which I have pushed to the front of the queue.
ID: 1969176
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1969185 - Posted: 7 Dec 2018, 23:18:38 UTC

07/12/2018 23:05:39 | SETI@home | Scheduler request completed: got 0 new tasks
07/12/2018 23:15:06 | SETI@home | Scheduler request completed: got 24 new tasks

Don't pay too much attention to transient, ephemeral artefacts. Give the servers a second chance.
ID: 1969185


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.