The Server Issues / Outages Thread - Panic Mode On! (118)

Message boards : Number crunching : The Server Issues / Outages Thread - Panic Mode On! (118)
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2027041 - Posted: 10 Jan 2020, 0:20:43 UTC

These BLC 35s are bad news; most of them are Instant Overflows. The Results received in last hour is already up to 205,271 and I only have One machine running them so far. Once other machines start running them I believe we will be constantly Out of Work: https://setiathome.berkeley.edu/results.php?hostid=6796479&offset=1120
I seem to be getting a large number of stuck Uploads as well...
ID: 2027041 · Report as offensive
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2027046 - Posted: 10 Jan 2020, 1:07:10 UTC - in response to Message 2027041.  

These BLC 35s are bad news; most of them are Instant Overflows. The Results received in last hour is already up to 205,271 and I only have One machine running them so far. Once other machines start running them I believe we will be constantly Out of Work: https://setiathome.berkeley.edu/results.php?hostid=6796479&offset=1120
I seem to be getting a large number of stuck Uploads as well...


. . Yep. 90% or more of the Blc35 tasks are noise bombs. This is gonna add to the havoc ...

Stephen

:(
ID: 2027046 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2027054 - Posted: 10 Jan 2020, 1:39:32 UTC

I see they added some old Arecibo files to the splitters in an attempt to slow down the return rate from all the noise bombs.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2027054 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2027055 - Posted: 10 Jan 2020, 1:55:35 UTC

Now the Server isn't responding. One machine's cache is down 50% and can't contact the Server. Naturally the Results Received and RTS are showing the change; it appears no one can contact the Server....
ID: 2027055 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2027059 - Posted: 10 Jan 2020, 2:10:42 UTC

Website is very unresponsive as of this moment.

Tom
A proud member of the OFA (Old Farts Association).
ID: 2027059 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2027060 - Posted: 10 Jan 2020, 2:20:12 UTC

Yep, every host in scheduler backoff due to "internal server error". Can't report work and caches falling.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2027060 · Report as offensive
Profile Freewill Project Donor
Joined: 19 May 99
Posts: 766
Credit: 354,398,348
RAC: 11,693
United States
Message 2027061 - Posted: 10 Jan 2020, 2:25:09 UTC - in response to Message 2027060.  

Yep, every host in scheduler backoff due to "internal server error". Can't report work and caches falling.

Ditto for me. Almost a full moon to howl at...
ID: 2027061 · Report as offensive
juan BFP · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2027062 - Posted: 10 Jan 2020, 2:28:13 UTC - in response to Message 2027061.  

Yep, every host in scheduler backoff due to "internal server error". Can't report work and caches falling.

Ditto for me. Almost a full moon to howl at...

Nothing here either. Caches holding. Ready to do my part.
ID: 2027062 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 2027063 - Posted: 10 Jan 2020, 2:32:40 UTC

Able to report now, at least. Not giving out work yet, though.
ID: 2027063 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2027066 - Posted: 10 Jan 2020, 3:22:27 UTC

Hey, with 36 tasks left, I started getting new work. Now to see if it lasts...
ID: 2027066 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 2027072 - Posted: 10 Jan 2020, 4:28:29 UTC
Last modified: 10 Jan 2020, 4:28:45 UTC

Got home to find Linux system out of GPU work due to extreme backoffs on downloads.
Grant
Darwin NT
ID: 2027072 · Report as offensive
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2027093 - Posted: 10 Jan 2020, 8:48:08 UTC - in response to Message 2027054.  
Last modified: 10 Jan 2020, 8:50:03 UTC

I see they added some ... {rest lost due to some error?}


. . Yep. I noticed that; most of my workload is now old Arecibo work ...

Stephen

<shrug>
ID: 2027093 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34772
Credit: 261,360,520
RAC: 489
Australia
Message 2027096 - Posted: 10 Jan 2020, 9:56:24 UTC - in response to Message 2027054.  

I see they added some old Arecibo files to the splitters in an attempt to slow down the return rate from all the noise bombs.
And a lot of them here ATM have been the same. :-(

Cheers.
ID: 2027096 · Report as offensive
Profile tazzduke
Volunteer tester

Joined: 15 Sep 07
Posts: 190
Credit: 28,269,068
RAC: 5
Australia
Message 2027110 - Posted: 10 Jan 2020, 11:45:21 UTC

Well, here are a few numbers to keep in mind:

Results out in the field - 7,536,318

Results returned and awaiting validation - 7,240,993

First time this year I have seen the returns past 7 million.

Results received in last hour - 195,093

Cheers
ID: 2027110 · Report as offensive
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 2027119 - Posted: 10 Jan 2020, 13:26:36 UTC - in response to Message 2027110.  
Last modified: 10 Jan 2020, 13:27:11 UTC

Well, here are a few numbers to keep in mind:

Results out in the field - 7,536,318

Results returned and awaiting validation - 7,240,993

First time this year I have seen the returns past 7 million.

Results received in last hour - 195,093

Cheers


Interesting numbers:


ID: 2027119 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2027120 - Posted: 10 Jan 2020, 13:28:13 UTC

Lots of short tasks....

Tom
A proud member of the OFA (Old Farts Association).
ID: 2027120 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2027126 - Posted: 10 Jan 2020, 14:13:10 UTC - in response to Message 2027120.  

Yes, I'm seeing shorties (VHAR) now from 27au10ad. I also saw the noise bombs (overflows) earlier from BLC35.

Neither of those is going to help with our bad drivers and bad cards. Eric put in a draconian attempt to weed those out yesterday:
Hopefully final validation mod to reduce bad results from failing GPUs
If 1 of 2 is overflow, quorum is increased to 3
If 1 of 3 is overflow, results are validated.
If 2 of 2 are overflow, quorum is increased to 3.
If 2 of 3 are overflow, quorum is increased to 4.
If 3 of 3 are overflow, results are validated.
4 results are always validated.
so expect lots of resends, too.
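
For anyone who finds the rules easier to read as code, here is a minimal sketch of the table above in Python. It is not Eric's validator code. The function name `decide_validation`, the `("validate", n)` / `("resend", n)` return values, and the handling of the cases the list does not mention (no overflows at all, or fewer than 2 results) are assumptions made only for illustration.

```python
def decide_validation(overflow_flags):
    """Apply the quoted overflow rules to one workunit.

    overflow_flags: list of bools, one per returned result,
    True if that result was an overflow.
    Returns ("validate", n) or ("resend", new_quorum).
    """
    n = len(overflow_flags)
    overflows = sum(overflow_flags)

    if n < 2:
        # Assumption: the rules only start once 2 results are in.
        raise ValueError("need at least 2 results")
    if n >= 4:
        # "4 results are always validated."
        return ("validate", n)
    if n == 2:
        # 1 of 2 or 2 of 2 overflow: quorum goes to 3.
        # Assumption: 0 of 2 overflow is validated as usual.
        return ("resend", 3) if overflows >= 1 else ("validate", 2)
    # n == 3
    if overflows == 2:
        # 2 of 3 overflow: quorum goes to 4.
        return ("resend", 4)
    # 1 of 3 or 3 of 3 overflow: validate.
    # Assumption: 0 of 3 overflow is validated as usual.
    return ("validate", 3)


if __name__ == "__main__":
    cases = [
        [True, False],
        [True, True],
        [True, False, False],
        [True, True, False],
        [True, True, True],
        [True, True, False, False],
    ]
    for flags in cases:
        print(flags, "->", decide_validation(flags))
```

Read this way, every workunit with at least one overflow among its first two results needs a third task, which is where the extra resends come from.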
ID: 2027126 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19065
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2027128 - Posted: 10 Jan 2020, 14:25:44 UTC - in response to Message 2027126.  
Last modified: 10 Jan 2020, 14:32:04 UTC

If 3 of 3 are overflow, results are validated.

That is a bug.
If the first two are ATI graphics cards, which will produce similar bad results, it doesn't matter what the third result is processed on: if it also -9's, then the two ATI cards will agree and the correct third result will receive an "invalid".

See https://setiathome.berkeley.edu/forum_thread.php?id=84508&postid=2026843#2026843

I expect there to be many of them in this batch of blc35.... tasks
ID: 2027128 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2027132 - Posted: 10 Jan 2020, 14:35:05 UTC - in response to Message 2027128.  

If 3 of 3 are overflow, results are validated.

That is a bug.
If the first two are ATI graphics cards, which will produce similar bad results, it doesn't matter what the third result is processed on: if it also -9's, then the two ATI cards will agree and the correct third result will receive an error.
I think we should monitor for a while. If anyone notices this case in the future, please post a link and we can examine. I think the three would still have to be 'weakly similar', so if the early overflow signals were of a completely different type, that could (should?) be flagged as 'invalid'.
ID: 2027132 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19065
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2027137 - Posted: 10 Jan 2020, 14:42:32 UTC - in response to Message 2027132.  

If 3 of 3 are overflow, results are validated.

That is a bug.
If the first two are ATI graphics cards, which will produce similar bad results, it doesn't matter what the third result is processed on: if it also -9's, then the two ATI cards will agree and the correct third result will receive an error.
I think we should monitor for a while. If anyone notices this case in the future, please post a link and we can examine. I think the three would still have to be 'weakly similar', so if the early overflow signals were of a completely different type, that could (should?) be flagged as 'invalid'.

IIRC, in the case quoted the results were 7 pulses & 23 triplets vs. 30 triplets.
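
The concern in this exchange can be sketched as a toy majority vote in Python. This is not how the real validator compares results: it ignores the 'weakly similar' check mentioned above and simply treats two results as agreeing when their per-type signal counts are identical. The function name `majority_verdict`, the host labels, and the assignment of the two count sets to particular hosts are all hypothetical.

```python
from collections import Counter


def majority_verdict(summaries):
    """Toy model: results agree only if their signal counts are identical.

    summaries: dict mapping a host label to a dict of signal counts.
    Returns a dict mapping each host label to "valid", "invalid",
    or "inconclusive" (when no strict majority exists).
    """
    # Turn each count dict into a hashable, order-independent key.
    keys = {host: tuple(sorted(counts.items()))
            for host, counts in summaries.items()}
    winner, votes = Counter(keys.values()).most_common(1)[0]
    if votes * 2 <= len(keys):
        return {host: "inconclusive" for host in keys}
    return {host: "valid" if key == winner else "invalid"
            for host, key in keys.items()}


if __name__ == "__main__":
    # Hypothetical assignment of the counts quoted in this thread:
    # two hosts return the same overflow, a third returns a different one.
    results = {
        "host_a": {"triplets": 30},
        "host_b": {"triplets": 30},
        "host_c": {"pulses": 7, "triplets": 23},
    }
    print(majority_verdict(results))
```

In this toy model the two matching results always outvote the third, whichever of them is actually correct, which is the situation worth watching for.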
ID: 2027137 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.