The Server Issues / Outages Thread - Panic Mode On! (118)

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13770
Credit: 208,696,464
RAC: 304
Australia
Message 2024801 - Posted: 24 Dec 2019, 8:19:38 UTC - in response to Message 2024773.  
Last modified: 24 Dec 2019, 8:20:25 UTC

Uhh, don't know where you read that the project was back to normal. It is not. Won't be until after tomorrow's outage/backup/recovery. (Fingers crossed.)
Maybe, hopefully, if we're very, very lucky.

It would still be nice if they sorted out what caused everything to slow down in the first place. Was it the re-enabling of Resend lost tasks, or something else?
While Anonymous Platform systems can't get work, systems running stock were still getting the odd Scheduler error or "Project has no tasks available", even with the extremely reduced server load. That was why I decided to just go back to Anonymous Platform and wait for a fix (that, along with the BOINC Manager deciding CUDA42 was best: 30min+ for something SoG was doing in 6min 30s or 9min, depending on the hardware).
Grant
Darwin NT
ID: 2024801
bluestar
Joined: 5 Sep 12
Posts: 7099
Credit: 2,084,789
RAC: 3
Message 2024806 - Posted: 24 Dec 2019, 8:59:57 UTC - in response to Message 2024801.  
Last modified: 24 Dec 2019, 9:00:23 UTC

Only a waste of time for the whole thing here, Grant, because the SoG CUDA tasks might report a bit more on the server than the corresponding CUDA50 tasks here.
ID: 2024806
W-K 666 Project Donor
Volunteer tester
Joined: 18 May 99
Posts: 19155
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2024814 - Posted: 24 Dec 2019, 11:25:53 UTC - in response to Message 2024801.  
Last modified: 24 Dec 2019, 11:27:43 UTC

Uhh, don't know where you read that the project was back to normal. It is not. Won't be until after tomorrow's outage/backup/recovery. (Fingers crossed.)
Maybe, hopefully, if we're very, very lucky.

It would still be nice if they sorted out what caused everything to slow down in the first place. Was it the re-enabling of Resend lost tasks, or something else?
While Anonymous Platform systems can't get work, systems running stock were still getting the odd Scheduler error or "Project has no tasks available", even with the extremely reduced server load. That was why I decided to just go back to Anonymous Platform and wait for a fix (that, along with the BOINC Manager deciding CUDA42 was best: 30min+ for something SoG was doing in 6min 30s or 9min, depending on the hardware).

I don't know where you're getting the idea that returning to stock was a waste of time. I did it reasonably soon after the problem was identified, and after a few attempts to get tasks and get back to crunching, all has gone as one would expect. There was a drop-off in performance while the system went through the questionable process of deciding which apps are best. Once it had settled on "SETI@home v8 8.22 windows_intelx86 (opencl_nvidia_SoG)" for my GPU, I admit I aborted the remaining CUDA tasks, and it has run the SoG app without breaks ever since. My cache is full and performance is only down by ~15%, as judged by RAC. And some of that might be because the computer has only grabbed one AP task in this period. This is for Windows; obviously, special sauce Linux performance is a different matter.
ID: 2024814
juan BFP Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2024815 - Posted: 24 Dec 2019, 11:37:11 UTC

Hope all will be fine and we could be back to run the anonymous apps after the outage.
Back in a couple of hours to see what we could expect.
Fingers crossed.
ID: 2024815
NorthCup
Joined: 6 Jun 99
Posts: 108
Credit: 50,093,984
RAC: 5
Germany
Message 2024820 - Posted: 24 Dec 2019, 12:48:05 UTC
Last modified: 24 Dec 2019, 12:52:09 UTC

All my Linux hosts here are without work. I reset BOINC on my last Windows computer. It got 70 tasks without any problems and now needs twice the time per WU. I think it is time to finally optimize the clients. For all operating systems! We haven't got money to throw out the window. Merry Christmas to all Setians!
ID: 2024820
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14656
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2024821 - Posted: 24 Dec 2019, 13:01:09 UTC

I only set one Windows host back to stock so I could watch what was happening. I've now reported the last stock tasks in anticipation of an early outage, and restored the host to normal. It'll be patiently waiting for anonymous platform work whenever we come back online.

My one fast Linux 'special sauce' box has a full cache and should see out the outage (if it isn't excessively long). I'll restore that one before I leave for my own holiday break tomorrow, and it can recover by itself from there.
ID: 2024821
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13770
Credit: 208,696,464
RAC: 304
Australia
Message 2024826 - Posted: 25 Dec 2019, 1:50:00 UTC

Well, we're back.

25/12/2019 11:18:01 | SETI@home | Scheduler request failed: Couldn't connect to server
But it did only take 4 seconds to error out, which is better than 30sec to a couple of minutes.
Grant
Darwin NT
ID: 2024826
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2024830 - Posted: 25 Dec 2019, 1:56:22 UTC

Can't connect to server.
Seti@Home classic workunits: 20,676 CPU time: 74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2024830
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13770
Credit: 208,696,464
RAC: 304
Australia
Message 2024831 - Posted: 25 Dec 2019, 1:59:39 UTC - in response to Message 2024830.  
Last modified: 25 Dec 2019, 2:00:28 UTC

Can't connect to server.
Yeah, but at least it only takes 4 secs now and not a couple of minutes...


How about stock hosts?
Grant
Darwin NT
ID: 2024831
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13770
Credit: 208,696,464
RAC: 304
Australia
Message 2024832 - Posted: 25 Dec 2019, 2:03:07 UTC

Just had a look at the server page. Transitioners are all down, Feeder is down.

And the Scheduling server is Disabled...
So we're not quite back yet.
Grant
Darwin NT
ID: 2024832
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2024833 - Posted: 25 Dec 2019, 2:03:43 UTC - in response to Message 2024831.  

Can't connect to server.
Yeah, but at least it only takes 4 secs now and not a couple of minutes...


How about stock hosts?


Same on my stock hosts. Can't connect, responds in a few seconds.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2024833
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13770
Credit: 208,696,464
RAC: 304
Australia
Message 2024835 - Posted: 25 Dec 2019, 2:27:47 UTC

Server status now showing almost all functions as Disabled.

Looks like the restart after the outage is having its own issues.
Grant
Darwin NT
ID: 2024835
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2024837 - Posted: 25 Dec 2019, 2:36:16 UTC - in response to Message 2024836.  

At least we're back to Server version 709 now


+1

I am still crunching through a bunch of Linux stock GPU tasks so I will continue being NNT until I can re-start my AIO stuff.

Tom
A proud member of the OFA (Old Farts Association).
ID: 2024837
Freewill Project Donor
Joined: 19 May 99
Posts: 766
Credit: 354,398,348
RAC: 11,693
United States
Message 2024838 - Posted: 25 Dec 2019, 2:40:10 UTC

Glad to see ver 709! Like Tom, and perhaps many others, I have to clear my stock queue and report before I switch back to special sauce.

Happy Holidays and Happy Crunching! Hope it's all good in the morning. Time for bed here.
ID: 2024838
juan BFP Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2024840 - Posted: 25 Dec 2019, 2:42:49 UTC
Last modified: 25 Dec 2019, 2:49:45 UTC

Let's wait a couple of hours for the system to stabilize, and then we can think about powering up our hungry hosts.

Time for Christmas dinner, wines & spend some time with the family.
ID: 2024840
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2024842 - Posted: 25 Dec 2019, 2:46:39 UTC - in response to Message 2024840.  

Let's wait a couple of hours for the system to stabilize, and then we can think about powering up our hungry hosts.

Time for Christmas dinner, wines & spend some time with the family.



+1

Or maybe even till tomorrow! :)

Tom
A proud member of the OFA (Old Farts Association).
ID: 2024842
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2024843 - Posted: 25 Dec 2019, 2:47:18 UTC

my Anonymous platform host just got 14 new tasks :)
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2024843
betreger Project Donor
Joined: 29 Jun 99
Posts: 11366
Credit: 29,581,041
RAC: 66
United States
Message 2024846 - Posted: 25 Dec 2019, 2:56:28 UTC - in response to Message 2024843.  

my Anonymous platform host just got 14 new tasks :)

I would like some also.
ID: 2024846
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65864
Credit: 55,293,173
RAC: 49
United States
Message 2024847 - Posted: 25 Dec 2019, 2:57:50 UTC
Last modified: 25 Dec 2019, 2:59:58 UTC

Zippo here since I reported 12 wu's.

And please no dinner, I feel bad and it's something that I ate that was brought here.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 2024847
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20500
Credit: 7,508,002
RAC: 20
United Kingdom
Message 2024848 - Posted: 25 Dec 2019, 2:59:08 UTC - in response to Message 2024759.  
Last modified: 25 Dec 2019, 2:59:56 UTC

Hence the statement "revert the database"...

Or more aptly:

Reverse the polarity of the neutron flow?

;-)

Coinciding with the s@h server reversion, we have an astronomical real-world reversion:

Reversed polarity sunspots!


Enjoy! :-)

Happy Festivities!!

Keep searchin',
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 2024848
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.