The Server Issues / Outages Thread - Panic Mode On! (119)

Message boards : Number crunching : The Server Issues / Outages Thread - Panic Mode On! (119)

AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2044174 - Posted: 11 Apr 2020, 5:20:17 UTC - in response to Message 2044173.  
Last modified: 11 Apr 2020, 5:21:07 UTC

Forums slowed to a crawl & then the website went MIA. Was it a glitch or are we about to crash & burn again?

Not sure, but I saw that too. If I were a betting man, I'd default to my favorites from Hackers.

Crash Override
Acid Burn
ID: 2044174
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2044183 - Posted: 11 Apr 2020, 8:48:33 UTC

The remaining resends will be processed quickly despite having long deadlines. The few hosts that will still be accepting work a few weeks from now will have empty caches so they will return the results very fast and also they are unlikely to just disappear if they haven't already done so by that time.
ID: 2044183
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19110
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2044194 - Posted: 11 Apr 2020, 10:40:10 UTC - in response to Message 2044183.  
Last modified: 11 Apr 2020, 10:41:33 UTC

The remaining resends will be processed quickly despite having long deadlines. The few hosts that will still be accepting work a few weeks from now will have empty caches so they will return the results very fast and also they are unlikely to just disappear if they haven't already done so by that time.

Actually, the hosts have probably signed up to projects with short turnaround times, and therefore SETI tasks will sit on the back burner until close to the expiry date.
ID: 2044194
Kissagogo27 Special Project $75 donor
Joined: 6 Nov 99
Posts: 716
Credit: 8,032,827
RAC: 62
France
Message 2044198 - Posted: 11 Apr 2020, 11:29:24 UTC

With the backlog cleared, the server's ghost-resend script needs to be reactivated, but in a different way: it should NOT RESEND ghosted WUs to hosts with antivirus problems ^^ but to OTHER hosts ... otherwise it will enter an endless loop.
ID: 2044198
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24881
Credit: 3,081,182
RAC: 7
Ireland
Message 2044200 - Posted: 11 Apr 2020, 11:37:52 UTC - in response to Message 2044194.  

Actually, the hosts have probably signed up to projects with short turnaround times, and therefore SETI tasks will sit on the back burner until close to the expiry date.
Yep, had that happen to me yesterday, so I suspended the project until all SETI tasks had completed.
ID: 2044200
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2044286 - Posted: 11 Apr 2020, 21:27:23 UTC

Well. SETI was my only project, and will remain so. I'll keep the machines doing what they do until it no longer runs. Then I'll uninstall BOINC.
ID: 2044286
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13755
Credit: 208,696,464
RAC: 304
Australia
Message 2044295 - Posted: 11 Apr 2020, 21:56:48 UTC

I see the Scheduler has had issues again, although it appears to coincide with general status updating problems: the graphs for a few other processes flat-lined around the time the Scheduler started to fall behind again.
At least it's catching up fairly quickly.
Grant
Darwin NT
ID: 2044295
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2044339 - Posted: 12 Apr 2020, 6:33:23 UTC
Last modified: 12 Apr 2020, 6:33:44 UTC

Here we get to the end of this project, and I'm finally getting my Windows laptop to play with my *nix boxes. OpenSSH is installed, and PowerShell is on my *nix boxes. Vim is installed on my laptop. It's almost cool. Would have been fun playing with all of this with a running project :)
ID: 2044339
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2044343 - Posted: 12 Apr 2020, 7:16:18 UTC - in response to Message 2044198.  

With the backlog cleared, the server's ghost-resend script needs to be reactivated, but in a different way: it should NOT RESEND ghosted WUs to hosts with antivirus problems ^^ but to OTHER hosts ... otherwise it will enter an endless loop.
It won't be an endless loop. Ghosts won't get new deadlines on resend, so they will expire on the original schedule and will then get resent to other users.
ID: 2044343
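
The argument in message 2044343 can be illustrated with a toy model. This is only a sketch, not BOINC server code: the `Task` class, the two functions, the host names, and the 30-day deadline are all hypothetical. The single rule taken from the post is that resending a ghost leaves its original deadline untouched, so repeated resends cannot keep a task alive forever.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Task:
    name: str
    host: str
    deadline: datetime
    resend_count: int = 0


def resend_ghost(task):
    """Resend a ghost to the same host; the deadline is deliberately unchanged."""
    task.resend_count += 1
    return task


def reissue_if_expired(task, now, other_host, turnaround):
    """Once the original deadline has passed, hand the work to another host."""
    if now <= task.deadline:
        return task  # still within the original deadline, nothing to do
    return Task(task.name, other_host, now + turnaround)


# Hypothetical example: host_a keeps losing the same task.
issued = datetime(2020, 3, 1)
ghost = Task("wu_1", "host_a", issued + timedelta(days=30))
for _ in range(5):
    resend_ghost(ghost)

# The deadline never moved, so the task still expires on schedule...
later = ghost.deadline + timedelta(hours=1)
replacement = reissue_if_expired(ghost, later, "host_b", timedelta(days=30))
print(replacement.host)  # ...and goes to a different host.
```

However many times the ghost is resent to `host_a`, the loop is bounded by the original deadline, after which the work moves to `host_b`.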
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2044344 - Posted: 12 Apr 2020, 7:19:14 UTC

Assimilation queue hasn't been this small since early February.
ID: 2044344
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14654
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2044347 - Posted: 12 Apr 2020, 7:48:48 UTC - in response to Message 2044344.  

I'm watching the assimilation queue - 2,221,676 at 07:40 UTC - and my personal total of valid but unpurged tasks - currently 8521.

In theory, they should both hit ground zero at the same time (near enough), but if they arrive separately it'll suggest another problem disguised in the noise.
ID: 2044347
BetelgeuseFive Project Donor
Volunteer tester

Joined: 6 Jul 99
Posts: 158
Credit: 17,117,787
RAC: 19
Netherlands
Message 2044350 - Posted: 12 Apr 2020, 8:11:45 UTC

Came across this computer: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8873201
It has 15k tasks in progress (running BOINC 7.17.0). How is this possible?

Tom
ID: 2044350
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2044353 - Posted: 12 Apr 2020, 8:29:43 UTC - in response to Message 2044350.  
Last modified: 12 Apr 2020, 8:30:54 UTC

Came across this computer: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8873201
It has 15k tasks in progress (running BOINC 7.17.0). How is this possible?

Tom

Really doesn't matter how. Looking at what their system can actually do, they're going to have a lot of timed-out tasks.
https://setiathome.berkeley.edu/host_app_versions.php?hostid=8873201

Edit: not to mention that this system hasn't contacted the servers since the 8th.
ID: 2044353
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 2044357 - Posted: 12 Apr 2020, 9:20:08 UTC - in response to Message 2044350.  
Last modified: 12 Apr 2020, 9:48:25 UTC

Came across this computer: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8873201
It has 15k tasks in progress (running BOINC 7.17.0). How is this possible?

Probably a cheater with a faked number of GPUs (the system claims to have 64 GPUs; how likely is that?) to get around the limits the project staff set to reduce the load on the database. People doing this should be banned from the project IMHO, at least for a week or so. Some of the tasks could also be ghosts.
ID: 2044357
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2044358 - Posted: 12 Apr 2020, 9:40:54 UTC - in response to Message 2044350.  

Came across this computer: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8873201
It has 15k tasks in progress (running BOINC 7.17.0). How is this possible?

Tom
Hasn't returned anything after April 2nd but kept contacting the servers for several days longer. So most likely an empty cache but a lot of ghosts.
ID: 2044358
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14654
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2044360 - Posted: 12 Apr 2020, 9:51:31 UTC - in response to Message 2044357.  

And, more specifically, that host hasn't contacted the servers since 8 April.

That's not a major problem - the servers have had plenty of other things to worry about recently - but looking at the Munin graphs in detail, the assimilation queue has declined from 4.82M to 2.07M in 36 hours - and at a very uniform slope. That process should be complete in a further 27 hours, around midday UTC Monday 13 April.

It's been pointed out elsewhere that the user has probably switched to another project, and BOINC will be giving that project priority until the resource shares even out. Fair enough, although with the owner having chosen anonymity, we can't see or link to their other project, if any.

But the user must have put some effort into building and configuring a beast like that, and presumably they did it specifically for the benefit of SETI@home. In that case, I would expect them to pause the replacement project, and return to SETI crunching, as soon as the servers are ready to receive and process the returned tasks. Wingmates are waiting for the returns, too, and the project staff are waiting to get on with their analysis. Let the cleanup commence on Monday afternoon.
ID: 2044360
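
The 27-hour figure in message 2044360 is a straight-line extrapolation from the two queue sizes read off the Munin graphs. A minimal Python sketch of that arithmetic follows, assuming the drain rate stays constant. The function name is illustrative only, and using the post's timestamp as the starting point is an assumption, since the post does not say exactly when the graph was read.

```python
from datetime import datetime, timedelta, timezone


def hours_until_empty(start_size, current_size, elapsed_hours):
    """Linearly extrapolate when a steadily draining queue reaches zero."""
    drained = start_size - current_size
    if drained <= 0 or elapsed_hours <= 0:
        raise ValueError("queue must have shrunk over a positive time span")
    rate_per_hour = drained / elapsed_hours
    return current_size / rate_per_hour


# Figures quoted in the post: 4.82M down to 2.07M over 36 hours.
remaining = hours_until_empty(4.82e6, 2.07e6, 36)

# Assumption: the graph was read at about the time the message was posted.
posted = datetime(2020, 4, 12, 9, 51, 31, tzinfo=timezone.utc)
eta = posted + timedelta(hours=remaining)

print(f"{remaining:.1f} hours remaining, empty around {eta:%d %b %Y %H:%M} UTC")
```

With the quoted numbers the result is roughly 27 hours, which matches the "around midday UTC Monday 13 April" estimate. The projection only holds while the slope on the graph stays as uniform as described.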
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 2044361 - Posted: 12 Apr 2020, 9:54:26 UTC

Once the assimilators have caught up, they should re-enable the ghost resending mechanism on the servers (the full version of it); the servers should now be able to handle that additional load.
ID: 2044361
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51469
Credit: 1,018,363,574
RAC: 1,004
United States
Message 2044362 - Posted: 12 Apr 2020, 9:56:30 UTC - in response to Message 2044361.  

Once the assimilators have caught up, they should re-enable the ghost resending mechanism on the servers (the full version of it); the servers should now be able to handle that additional load.

I'll second that motion.

Meow!
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 2044362
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13755
Credit: 208,696,464
RAC: 304
Australia
Message 2044363 - Posted: 12 Apr 2020, 9:59:54 UTC

Ah, we're back.
I was getting nothing but a dozen or so messages about missing drives on Carolyn for a while there.
Grant
Darwin NT
ID: 2044363
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 2044365 - Posted: 12 Apr 2020, 10:23:36 UTC - in response to Message 2044362.  

Once the assimilators have caught up, they should re-enable the ghost resending mechanism on the servers (the full version of it); the servers should now be able to handle that additional load.

I'll second that motion.

Meow!

Perhaps you can send a message to Eric like you did in the past (IIRC).
ID: 2044365


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.