The Server Issues / Outages Thread - Panic Mode On! (118)

Message boards : Number crunching : The Server Issues / Outages Thread - Panic Mode On! (118)
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2024703 - Posted: 23 Dec 2019, 19:55:29 UTC - in response to Message 2024701.  

BOINC seems to be very accommodating when you want to do your own thing. I mean, someone had the foresight to create the <dont_check_file_sizes> flag, seemingly just to let people do what they want. And I'm glad they did, for instances like this. It basically lets you run anonymous platform when not under anonymous platform lol.
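
For readers who have not met the flag mentioned above, here is a minimal sketch of what setting it might look like. It assumes the flag lives under the <options> element of the client's cc_config.xml, as described in the BOINC client configuration documentation. The helper name build_cc_config is hypothetical and only builds the XML text; it does not locate or modify any real BOINC data directory.

```python
import xml.etree.ElementTree as ET


def build_cc_config(skip_size_checks: bool = True) -> str:
    """Return cc_config.xml text with <dont_check_file_sizes> set.

    Hypothetical helper for illustration: it only builds the XML string
    and does not touch any BOINC installation.
    """
    root = ET.Element("cc_config")
    # Assumption: the flag belongs under <options>, per the BOINC docs.
    options = ET.SubElement(root, "options")
    flag = ET.SubElement(options, "dont_check_file_sizes")
    flag.text = "1" if skip_size_checks else "0"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cc_config())
```

An existing cc_config.xml may already carry other options, so merging the element into that file by hand is probably safer than overwriting it with generated output.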
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2024703 · Report as offensive
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 2024707 - Posted: 23 Dec 2019, 20:03:30 UTC

Both my hosts have stalled downloads now...
ID: 2024707 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2024728 - Posted: 23 Dec 2019, 21:44:09 UTC
Last modified: 23 Dec 2019, 21:44:31 UTC

I'm shutting down my host until the issue is corrected. Hopefully soon.

For those who celebrate, I wish you a Merry Christmas.
ID: 2024728 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2024733 - Posted: 23 Dec 2019, 22:26:00 UTC

. . On the one machine I changed to stock (renamed app_info.xml) I received a good number of tasks over several hours, but since implementing the AnonPlat workaround, while crunching is AOK, I have not received a single new task.

. . I think that roll back is essential ........

Stephen

:(
ID: 2024733 · Report as offensive
Cherokee150

Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 2024741 - Posted: 23 Dec 2019, 22:57:27 UTC

There have been so many posts on this SETI problem that I may have missed an answer to the following question.
Does the staff at Berkeley think a resolution might come soon, or might it take more than a week?
From reading the posts, I believe that is the question for which everyone really needs an answer.
ID: 2024741 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2024742 - Posted: 23 Dec 2019, 23:03:41 UTC - in response to Message 2024741.  

There have been so many posts on this SETI problem that I may have missed an answer to the following question.
Does the staff at Berkeley think a resolution might come soon, or might it take more than a week?
From reading the posts, I believe that is the question for which everyone really needs an answer.

I believe this is the best answer available for your question: https://setiathome.berkeley.edu/forum_thread.php?id=84983&postid=2024655
ID: 2024742 · Report as offensive
Cherokee150

Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 2024745 - Posted: 23 Dec 2019, 23:23:04 UTC - in response to Message 2024742.  

Muchas gracias Juan.
That is exactly what I was looking for.
To save time for everyone who hasn't seen it, I will reprint Richard Haselgrove's earlier post:

____________________________________________________________________________________
Posted: 23 Dec 2019, 16:55:49 UTC

For general information

I've had a provisional whisper back from the project. As of about an hour ago, the provisional plan is to go through normal maintenance and backup tomorrow (Tuesday), and then revert the database changes so we can go back to running the old server code.

My gloss on that is that it probably means a longer outage, and an even worse than usual recovery, but everyone should be able to enjoy the rest of the holiday in peace. Most especially, the staff can relax. Sounds like a good idea to me.
ID: 2024655
____________________________________________________________________________________


p.s. Thank you, Richard!
ID: 2024745 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 2024756 - Posted: 24 Dec 2019, 0:54:31 UTC - in response to Message 2024655.  
Last modified: 24 Dec 2019, 1:14:10 UTC

For general information

I've had a provisional whisper back from the project. As of about an hour ago, the provisional plan is to go through normal maintenance and backup tomorrow (Tuesday), and then revert the database changes so we can go back to running the old server code.

My gloss on that is that it probably means a longer outage, and an even worse than usual recovery, but everyone should be able to enjoy the rest of the holiday in peace. Most especially, the staff can relax. Sounds like a good idea to me.

This sounds great. I wish them all the best with reverting to an earlier version of the server code.
ID: 2024756 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1858
Credit: 268,616,081
RAC: 1,349
United States
Message 2024757 - Posted: 24 Dec 2019, 1:05:02 UTC - in response to Message 2024756.  

For general information

I've had a provisional whisper back from the project. As of about an hour ago, the provisional plan is to go through normal maintenance and backup tomorrow (Tuesday), and then revert the database changes so we can go back to running the old server code.

My gloss on that is that it probably means a longer outage, and an even worse than usual recovery, but everyone should be able to enjoy the rest of the holiday in peace. Most especially, the staff can relax. Sounds like a good idea to me.

This sounds great. I wish them all the best with reverting to an earlier version of the server code. Hope what Eric said below isn't true
Message 2024312 - Posted: 22 Dec 2019, 8:10:05 UTC (Old 117 thread)
Unfortunately there's a database change that renders the 7.09 server inoperable. :(

Hence the statement "revert the database"...
ID: 2024757 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 2024758 - Posted: 24 Dec 2019, 1:18:16 UTC - in response to Message 2024757.  

For general information

I've had a provisional whisper back from the project. As of about an hour ago, the provisional plan is to go through normal maintenance and backup tomorrow (Tuesday), and then revert the database changes so we can go back to running the old server code.

My gloss on that is that it probably means a longer outage, and an even worse than usual recovery, but everyone should be able to enjoy the rest of the holiday in peace. Most especially, the staff can relax. Sounds like a good idea to me.

This sounds great. I wish them all the best with reverting to an earlier version of the server code. Hope what Eric said below isn't true
Message 2024312 - Posted: 22 Dec 2019, 8:10:05 UTC (Old 117 thread)
Unfortunately there's a database change that renders the 7.09 server inoperable. :(

Hence the statement "revert the database"...

Thanks for that. I've changed my original post.
ID: 2024758 · Report as offensive
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21572
Credit: 7,508,002
RAC: 20
United Kingdom
Message 2024759 - Posted: 24 Dec 2019, 1:20:27 UTC - in response to Message 2024757.  

Hence the statement "revert the database"...

Or more aptly:

Reverse the polarity of the neutron flow?

;-)

Keep searchin',
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 2024759 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2024760 - Posted: 24 Dec 2019, 1:23:05 UTC
Last modified: 24 Dec 2019, 1:23:38 UTC

Another problem shared with BETA: some of these host detail numbers have been frozen all day, namely the APR numbers.
SETI@home v8 8.11 x86_64-apple-darwin (cuda42_mac)
Number of tasks completed: 135
Max tasks per day: 156
Number of tasks today: 2873
Consecutive valid tasks: 54
Average processing rate: 1,740.18 GFLOPS

SETI@home v8 8.19 x86_64-apple-darwin (opencl_nvidia_mac_old)
Number of tasks completed: 139
Max tasks per day: 52
Number of tasks today: 2000
Consecutive valid tasks: 2
Average processing rate: 1,768.32 GFLOPS

By this time, the cuda42 numbers should be quite a bit higher, but they haven't changed.
https://setiathome.berkeley.edu/host_app_versions.php?hostid=6796479
ID: 2024760 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 2024761 - Posted: 24 Dec 2019, 1:34:26 UTC - in response to Message 2024745.  

Muchas gracias Juan.
That is exactly what I was looking for.
To save time for everyone who hasn't seen it, I will reprint Richard Haselgrove's earlier post:

____________________________________________________________________________________
Posted: 23 Dec 2019, 16:55:49 UTC

For general information

I've had a provisional whisper back from the project. As of about an hour ago, the provisional plan is to go through normal maintenance and backup tomorrow (Tuesday), and then revert the database changes so we can go back to running the old server code.

My gloss on that is that it probably means a longer outage, and an even worse than usual recovery, but everyone should be able to enjoy the rest of the holiday in peace. Most especially, the staff can relax. Sounds like a good idea to me.
ID: 2024655
____________________________________________________________________________________


p.s. Thank you, Richard!


My best wishes to you!

p.s.
After another call to Santa I can confirm the following:
I'm going to replace some GPUs, and it seems I'll have time to tune them up too.

Nightey Nightey! [nati nati]
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 2024761 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2024768 - Posted: 24 Dec 2019, 2:34:19 UTC - in response to Message 2024761.  

My best wishes to you!
p.s.
After another call to Santa I can confirm the following:
I'm going to replace some GPUs, and it seems I'll have time to tune them up too.
Nightey Nightey! [nati nati]

. . Can I have the discards ? ? :) . {joke}

. . Merry Christmas Petri!

Stephen

:)
ID: 2024768 · Report as offensive
Profile Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 212
United States
Message 2024771 - Posted: 24 Dec 2019, 3:32:49 UTC

Hi All

I haven't been very active the last couple of years or so, but I thought I might fire up a box or two now that SAH has had the time to fix the software/server/WU availability issues that had been plaguing them...
Member of the 20 Year Club



ID: 2024771 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 37373
Credit: 261,360,520
RAC: 489
Australia
Message 2024772 - Posted: 24 Dec 2019, 3:33:11 UTC

After some interruptions, and getting a couple of jobs done outside before that so-called rain gets here (if it even bothers to turn up at all), I've just finished cleaning all the dust, ash and soot from the last 3 months out of my 2500K rig, as well as updating it, so it's ready for when things come back up properly.

I'll do the other in the morning as it's getting far too warm here now and my back needs a rest while I need to enjoy a few beers. ;-)

Cheers.
ID: 2024772 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2024773 - Posted: 24 Dec 2019, 3:34:52 UTC

Uhh, don't know where you read that the project was back to normal. It is not. Won't be until after tomorrow's outage/backup/recovery. (Fingers crossed.)
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2024773 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13886
Credit: 208,696,464
RAC: 304
Australia
Message 2024801 - Posted: 24 Dec 2019, 8:19:38 UTC - in response to Message 2024773.  
Last modified: 24 Dec 2019, 8:20:25 UTC

Uhh, don't know where you read that the project was back to normal. It is not. Won't be until after tomorrow's outage/backup/recovery. (Fingers crossed.)
Maybe, hopefully, if we're very, very lucky.

It would still be nice if they sorted out what caused everything to slow down in the first place: was it the re-enabling of Resend lost tasks, or something else?
Because while Anonymous Platform systems can't get work, systems running stock were still getting the odd Scheduler error or "Project has no tasks available", even with the extremely reduced server load. That was why I decided to just go back to Anonymous Platform and wait for a fix (along with the BOINC Manager deciding CUDA42 was best: 30min+ for something SoG was doing in 6min 30s or 9min, depending on the hardware).
Grant
Darwin NT
ID: 2024801 · Report as offensive
bluestar

Joined: 5 Sep 12
Posts: 7344
Credit: 2,084,789
RAC: 3
Message 2024806 - Posted: 24 Dec 2019, 8:59:57 UTC - in response to Message 2024801.  
Last modified: 24 Dec 2019, 9:00:23 UTC

Only a waste of time for the whole thing here, Grant, because the SoG CUDA tasks might report a bit more on the server than the corresponding CUDA50 tasks here.
ID: 2024806 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19494
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2024814 - Posted: 24 Dec 2019, 11:25:53 UTC - in response to Message 2024801.  
Last modified: 24 Dec 2019, 11:27:43 UTC

Uhh, don't know where you read that the project was back to normal. It is not. Won't be until after tomorrow's outage/backup/recovery. (Fingers crossed.)
Maybe, hopefully, if we're very, very lucky.

It would still be nice if they sorted out what caused everything to slow down in the first place: was it the re-enabling of Resend lost tasks, or something else?
Because while Anonymous Platform systems can't get work, systems running stock were still getting the odd Scheduler error or "Project has no tasks available", even with the extremely reduced server load. That was why I decided to just go back to Anonymous Platform and wait for a fix (along with the BOINC Manager deciding CUDA42 was best: 30min+ for something SoG was doing in 6min 30s or 9min, depending on the hardware).

I don't know where you're getting the idea that returning to stock was a waste of time. I did it reasonably soon after the problem was identified, and after a few attempts to get tasks and get back to crunching, all has gone as one would expect. There was a drop-off in performance while the system went through the questionable process of deciding which apps are best. Once that was settled as "SETI@home v8 8.22 windows_intelx86 (opencl_nvidia_SoG)" for my GPU, I admit I aborted the remaining CUDA tasks, and it has run the SoG app without breaks ever since. My cache is full and performance is only down by ~15%, as judged by RAC. And some of that might be because the computer has only grabbed one AP task in this period. This is for Windows; obviously, special sauce Linux performance is a different matter.
ID: 2024814 · Report as offensive


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.