Panic Mode On (108) Server Problems?

Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 1905854 - Posted: 9 Dec 2017, 6:12:03 UTC

I seem to be getting all the noisy GBT WUs that were mentioned earlier. Reporting 10+ results at a time at the moment.
Grant
Darwin NT
ID: 1905854 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1905857 - Posted: 9 Dec 2017, 6:22:29 UTC - in response to Message 1905851.  

Thanks for the explanation. Do you have to set NNT before hitting Update?
It only takes 2 updates, but it must be 2 that are NOT returning tasks in a row. The third is if you need to report something. It can be a bit tricky at times with fast GPUs to find the right time to make the 'nudge'.

I generally don't have to nudge them often. When I saw your messages here I checked and was down >10% on them all. They have all recovered on their own since, though some are back down now. I very rarely see them drop below 66% without recovering on their own.

When this type of issue develops on my machines, it is always dire. I rarely see them recover on their own without some intervention. Judging from the Event Logs, the machines can go for several hours during the wee hours while I'm sleeping with none of them running tasks. The first task of the morning is to open up BoincTasks and see what state the machines are in, then go round robin and kick every one over to get tasks again.

I had a long run of no issues up till the last server hiccup last week. Longest I can remember back to the 'ole days when the servers and machines just ran without intervention on my part.

The Triple Update would never work on the Linux machine I guess since it is always reporting tasks every five minutes .... that is when it has work. The Windows machines have been making a slow recovery back to full. The Linux machine continues to fall.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1905857 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 1905862 - Posted: 9 Dec 2017, 6:29:40 UTC - in response to Message 1905857.  

The Triple Update would never work on the Linux machine I guess since it is always reporting tasks every five minutes

My i7 with 2*GTX1070s is the one that has the most issues, even though on every Scheduler request it is reporting work. The triple update is what gets it downloading work again.
Grant
Darwin NT
ID: 1905862 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1905863 - Posted: 9 Dec 2017, 6:35:00 UTC - in response to Message 1905857.  

The Triple Update would never work on the Linux machine I guess since it is always reporting tasks every five minutes
That doesn't matter. Just pick any time that you can ... Report tasks, then make 2 more updates before reporting any more.
ID: 1905863 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1905865 - Posted: 9 Dec 2017, 6:39:32 UTC - in response to Message 1905863.  

OK, got it. Was confused as to timing. Just tried the technique. We'll see.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1905865 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 36816
Credit: 261,360,520
RAC: 489
Australia
Message 1905866 - Posted: 9 Dec 2017, 6:50:42 UTC

According to my logs there's been no problems getting work here.

Cheers.
ID: 1905866 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1905867 - Posted: 9 Dec 2017, 6:57:03 UTC - in response to Message 1905865.  

I just tried through 3 cycles. I must not have done it right the first 2 times, 'cause I still got the "no tasks are available" message. The third try was either correct or a coincidence, but I got 79 tasks after the technique. I was about to do the ghost recovery protocol since I was down over 120 tasks. I'll see if the next request pulls me back flush. It did. Yay!
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1905867 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1905868 - Posted: 9 Dec 2017, 7:00:31 UTC - in response to Message 1905867.  

Cool.
ID: 1905868 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13855
Credit: 208,696,464
RAC: 304
Australia
Message 1905871 - Posted: 9 Dec 2017, 7:14:03 UTC - in response to Message 1905866.  

According to my logs there's been no problems getting work here.

As I said, it's a particularly weird issue.
Grant
Darwin NT
ID: 1905871 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22535
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1905881 - Posted: 9 Dec 2017, 8:42:50 UTC - in response to Message 1905748.  

David - you are correct: Beta data does not get into the main database. Runs on Beta can be for all sorts of reasons, but they all have to do with testing something in a near-production environment.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1905881 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1905898 - Posted: 9 Dec 2017, 11:55:19 UTC - in response to Message 1905854.  
Last modified: 9 Dec 2017, 12:02:03 UTC

I seem to be getting all the noisy GBT WUs that were mentioned earlier. Reporting 10+ results at a time at the moment.


. . I have never seen so many noisy GBT WUs as I have in the past few weeks. It's a bit of a worry for a "radio noise free" location such as Green Bank.

Stephen

ID: 1905898 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1905899 - Posted: 9 Dec 2017, 12:00:30 UTC - in response to Message 1905866.  

According to my logs there's been no problems getting work here.

Cheers.


. . Oh you always say that :) ... except for the big outage the other week. But I must admit, apart from a few "no tasks available" responses, this weekend I am getting work pretty consistently myself.

Stephen

<touching wood>
ID: 1905899 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1905903 - Posted: 9 Dec 2017, 12:16:24 UTC - in response to Message 1905898.  

I have never seen so many noisy GBT WUs as I have in the past few weeks. It's a bit of a worry for a "radio noise free" location such as Green Bank.
Maybe they've just tuned in to ET's equivalent of 'I Love Lucy', and our clients are being snooty about it?
ID: 1905903 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1905962 - Posted: 9 Dec 2017, 18:00:30 UTC

For me the Results pages are slow to respond, and I'm having trouble getting tasks for two machines. One of the machines has yet to respond and is getting very low on tasks. I may have to try the Bunker option for tonight's coldest night of the season.
ID: 1905962 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1905967 - Posted: 9 Dec 2017, 18:22:05 UTC - in response to Message 1905962.  
Last modified: 9 Dec 2017, 18:24:20 UTC

I am seeing that too because of the assimilators and purgers being offline for the past day. The Valid tasks are building rapidly and aren't being trimmed in the database. Things are getting slow. The Linux cruncher is continuing to have troubles getting work. For some reason the scheduler keeps throwing 10-20 minute GPU backoffs at it.
[Edit] Just looked and the assimilators have made an appearance finally. That should improve things shortly. Someone must be in the lab this morning.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1905967 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1905976 - Posted: 9 Dec 2017, 19:21:54 UTC - in response to Message 1905903.  

I have never seen so many noisy GBT WUs as I have in the past few weeks. It's a bit of a worry for a "radio noise free" location such as Green Bank.
Maybe they've just tuned in to ET's equivalent of 'I Love Lucy', and our clients are being snooty about it?


. . Being picky ? :)

Stephen

:)
ID: 1905976 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1905980 - Posted: 9 Dec 2017, 19:51:21 UTC - in response to Message 1905967.  
Last modified: 9 Dec 2017, 19:53:32 UTC

I am seeing that too because of the assimilators and purgers being offline for the past day. The Valid tasks are building rapidly and aren't being trimmed in the database. Things are getting slow. The Linux cruncher is continuing to have troubles getting work. For some reason the scheduler keeps throwing 10-20 minute GPU backoffs at it.
[Edit] Just looked and the assimilators have made an appearance finally. That should improve things shortly. Someone must be in the lab this morning.


. . Many thanks to whoever took the time out of their Saturday. But I had a run of "no tasks available" overnight, gave it the double kick this a.m. and am now getting work again (though my cache is 80 - 90% GBT tasks). Still have about double the normal number of valid tasks, hope it clears soon.

Stephen

..
ID: 1905980 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.