The Server Issues / Outages Thread - Panic Mode On! (117)

Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2023781 - Posted: 19 Dec 2019, 23:26:22 UTC

Just got a "Project Down" message and while I have plenty of gpu/cpu tasks to munch on, I swear I still haven't cleared the log jam of "ready to reports" that I have possibly from Tuesday...

Tom
A proud member of the OFA (Old Farts Association).
ID: 2023781 · Report as offensive
Profile Freewill Project Donor
Joined: 19 May 99
Posts: 766
Credit: 354,398,348
RAC: 11,693
United States
Message 2023797 - Posted: 20 Dec 2019, 0:36:37 UTC

They just disabled most everything again. Guess they're still trying to get it right. Hope this is resolved before Christmas. I'm not caching that much!
ID: 2023797 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11151
Credit: 29,581,041
RAC: 66
United States
Message 2023802 - Posted: 20 Dec 2019, 1:09:58 UTC

We are blessed, 2 outrages in a week is truly a blessing.
ID: 2023802 · Report as offensive
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2023803 - Posted: 20 Dec 2019, 1:11:52 UTC - in response to Message 2023802.  

We are blessed, 2 outrages in a week is truly a blessing.


. . Really ???

Stephen

. . <then thanking my lucky stars ...>

:)
ID: 2023803 · Report as offensive
Profile KW2E
Joined: 18 May 99
Posts: 346
Credit: 104,396,190
RAC: 34
United States
Message 2023823 - Posted: 20 Dec 2019, 3:44:01 UTC - in response to Message 2023803.  

All better, for now. Sending and receiving as of about 03:00Z
ID: 2023823 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13373
Credit: 208,696,464
RAC: 304
Australia
Message 2023831 - Posted: 20 Dec 2019, 5:45:09 UTC - in response to Message 2023823.  
Last modified: 20 Dec 2019, 5:47:49 UTC

All better, for now. Sending and receiving as of about 03:00Z
Maybe for you.
Got home to no GPU work on my Linux system; looks like the Scheduler fell over again and the system went into extreme backoff mode. At least now it's possible to report work, but with the splitter output well below the huge demand level, getting any work is a matter of luck.


Since Synergy runs the Scheduler process and the Feeder, and also does the Validation work, it might help things if they sorted out the issue with Validation/Assimilation/Deletion & purging.
Results-returned-and-awaiting-validation is usually around 4-6 million. It's now at 9.6 million & heading for 10.
If that causes Synergy to choke, then that impacts the Scheduler, which is what has occurred for 2 days running now.


Edit- and they need to sort out the download issues as well. Getting more & more downloads sticking. Along with the very occasional uploads that take a second attempt to get through.

What did happen to that new download server?
Grant
Darwin NT
ID: 2023831 · Report as offensive
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2023848 - Posted: 20 Dec 2019, 9:04:25 UTC - in response to Message 2023831.  

Edit- and they need to sort out the download issues as well. Getting more & more downloads sticking. Along with the very occasional uploads that take a second attempt to get through.
What did happen to that new download server?


. . It's still ticking over (barely) in Beta ...

Stephen

? ?
ID: 2023848 · Report as offensive
Phil Burden

Joined: 26 Oct 00
Posts: 264
Credit: 22,303,899
RAC: 0
United Kingdom
Message 2023866 - Posted: 20 Dec 2019, 14:15:39 UTC

The replica database is getting further behind the master as time ticks by, which makes the server status somewhat of a lottery ;-)

P.
ID: 2023866 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13157
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2023879 - Posted: 20 Dec 2019, 16:40:46 UTC

The schedulers are being stingy handing out work again. No errors or anything but just getting one or two tasks each time instead of dozens. Caches dwindling.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2023879 · Report as offensive
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 813
Credit: 2,361,516
RAC: 22
United States
Message 2023882 - Posted: 20 Dec 2019, 16:58:27 UTC

The RTS has WUs, but it isn't giving my slow machine anything. That's ok, I have enough for another day, but I thought I would fill up (2 days) while the RTS was flush.
ID: 2023882 · Report as offensive
Profile Mr. Kevvy · Crowdfunding Project Donor · Special Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3575
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 2023885 - Posted: 20 Dec 2019, 17:24:56 UTC

Dr. Korpela has posted in News:

It's the Friday before a holiday week and the servers know it.

The file system containing the beta project uploads directory is having problems, so beta is down until further notice.

This problem may be affecting the rate at which the main project can handle results, so the validation and assimilation queues are getting large, which may affect the rate of work generation.

ID: 2023885 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2023889 - Posted: 20 Dec 2019, 17:44:06 UTC - in response to Message 2023885.  
Last modified: 20 Dec 2019, 17:51:09 UTC

Something is definitely wrong with the database. Try opening this link; I haven't been able to open it: https://setiathome.berkeley.edu/results.php?hostid=6906726&state=6
This one also fails to open: https://setiathome.berkeley.edu/results.php?hostid=6796479&state=6
ID: 2023889 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14509
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2023891 - Posted: 20 Dec 2019, 17:52:32 UTC - in response to Message 2023889.  

Something is definitely wrong with the database. Try opening this link; I haven't been able to open it: https://setiathome.berkeley.edu/results.php?hostid=6906726&state=6
The basic task list opens fine (though it's big: 22,276 in all). It's failing to cope with the extra query term for the single error task, at least within a reasonable time. How far down the list is that error?
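
To make the "extra query term" concrete: the failing link is the host's task list plus a `state=6` filter. Here is a minimal Python sketch, assuming the basic list is simply the same URL with `state` removed. The helper name `drop_query_term` is made up for illustration and is not part of the project's site code.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# The filtered link quoted above (error tasks only, per the discussion).
FILTERED_URL = "https://setiathome.berkeley.edu/results.php?hostid=6906726&state=6"


def drop_query_term(url: str, term: str) -> str:
    """Return the URL with one query parameter removed, leaving the rest intact."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    params.pop(term, None)  # no error if the term is absent
    return urlunsplit(parts._replace(query=urlencode(params, doseq=True)))


# Assumption: the unfiltered task list is the same page without the state term.
basic_url = drop_query_term(FILTERED_URL, "state")
print(basic_url)
```

If the page without `state` loads and the page with it times out, the filter is probably what the database is struggling with, rather than the host record itself.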
ID: 2023891 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2023893 - Posted: 20 Dec 2019, 18:01:56 UTC - in response to Message 2023891.  
Last modified: 20 Dec 2019, 18:30:52 UTC

I'm not having much luck looking through the list, it's very slow...

Found it, https://setiathome.berkeley.edu/result.php?resultid=8352959966
The 960 hasn't been happy since I installed it, I suppose I need to change the power connections. Maybe move it to the other power supply.
ID: 2023893 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13157
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2023899 - Posted: 20 Dec 2019, 18:22:37 UTC

I haven't been able to access my error task list since the project finally came back.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2023899 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4263
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2023910 - Posted: 20 Dec 2019, 18:36:24 UTC - in response to Message 2023889.  

Something is definitely wrong with the database. Try opening this link; I haven't been able to open it: https://setiathome.berkeley.edu/results.php?hostid=6906726&state=6
This one also fails to open: https://setiathome.berkeley.edu/results.php?hostid=6796479&state=6


it loaded for me.

Task: 8352959966
Work unit: 3709054718
Sent: 20 Dec 2019, 6:12:22 UTC
Time reported or deadline: 20 Dec 2019, 12:49:51 UTC
Status: Error while computing
Run time (sec): 4,019.35
CPU time (sec): 3,090.63
Credit: ---
Application: SETI@home v8 Anonymous platform (NVIDIA GPU)

Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2023910 · Report as offensive
JohnDK · Crowdfunding Project Donor · Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 2023929 - Posted: 20 Dec 2019, 21:44:21 UTC
Last modified: 20 Dec 2019, 21:44:34 UTC

So why does the replica keep falling further behind?
ID: 2023929 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13157
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2023930 - Posted: 20 Dec 2019, 21:45:48 UTC - in response to Message 2023929.  
Last modified: 20 Dec 2019, 21:47:16 UTC

Down for maintenance again. Half the assimilators and validators are turned off.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2023930 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1635
Credit: 12,921,799
RAC: 89
New Zealand
Message 2023936 - Posted: 20 Dec 2019, 22:43:11 UTC - in response to Message 2023930.  

Down for maintenance again. Half the assimilators and validators are turned off.

On the bright side, Keith, at least it's not all of them.
ID: 2023936 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13157
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2023939 - Posted: 20 Dec 2019, 22:56:46 UTC

I see that they brought the missing validators back up again after they were down. Don't see any improvement on validation tasks outstanding. They really need to whittle that down to something manageable. I haven't been able to pull up any task web page on any of my hosts since the outage. Don't know if the hosts are doing OK or not.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2023939 · Report as offensive


 
©2022 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.