The Server Issues / Outages Thread - Panic Mode On! (119)

Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2038502 - Posted: 17 Mar 2020, 15:55:33 UTC - in response to Message 2038491.  

Looks like the scheduler just died. My hosts get just failures when trying to connect.

All I get also.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2038502
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2038503 - Posted: 17 Mar 2020, 15:58:11 UTC - in response to Message 2038502.  

Looks like the scheduler just died. My hosts get just failures when trying to connect.

All I get also.


I get an occasional message that the "reference site" servers may be down, but I'm still getting through eventually. Almost 9 AM here.
17-Mar-2020 07:52:50 [SETI@home] Scheduler request completed: got 0 new tasks
17-Mar-2020 07:58:26 [SETI@home] Scheduler request completed: got 0 new tasks
17-Mar-2020 08:14:58 [SETI@home] Scheduler request completed: got 0 new tasks
17-Mar-2020 08:21:06 [SETI@home] Scheduler request completed: got 0 new tasks
17-Mar-2020 08:26:58 [SETI@home] Scheduler request completed: got 0 new tasks
17-Mar-2020 08:33:10 [SETI@home] Scheduler request completed: got 0 new tasks
17-Mar-2020 08:39:15 [SETI@home] Scheduler request completed: got 0 new tasks
17-Mar-2020 08:47:27 [SETI@home] Scheduler request completed: got 5 new tasks
17-Mar-2020 08:48:05 [SETI@home] Scheduler request completed: got 0 new tasks
17-Mar-2020 08:53:13 [SETI@home] Scheduler request completed: got 0 new tasks
ID: 2038503
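For anyone who wants to tally a pasted event log like the one above, here is a rough Python sketch. It assumes the lines look exactly like the ones quoted in this thread (timestamp, project name in brackets, "Scheduler request completed: got N new tasks"); the function name `summarize_scheduler_log` is made up for illustration and is not part of BOINC.

```python
import re

# Matches lines in the shape pasted in this thread, e.g.
# "17-Mar-2020 08:47:27 [SETI@home] Scheduler request completed: got 5 new tasks"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}) \[(?P<project>[^\]]+)\] "
    r"Scheduler request completed: got (?P<n>\d+) new tasks?$"
)


def summarize_scheduler_log(lines):
    """Count scheduler requests, how many returned work, and total tasks received."""
    requests = 0
    productive = 0
    tasks = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # ignore anything that is not a scheduler-completed line
        requests += 1
        received = int(match.group("n"))
        tasks += received
        if received > 0:
            productive += 1
    return {"requests": requests, "productive": productive, "tasks": tasks}


sample = [
    "17-Mar-2020 08:39:15 [SETI@home] Scheduler request completed: got 0 new tasks",
    "17-Mar-2020 08:47:27 [SETI@home] Scheduler request completed: got 5 new tasks",
    "17-Mar-2020 08:48:05 [SETI@home] Scheduler request completed: got 0 new tasks",
]
print(summarize_scheduler_log(sample))
```

Timestamps are kept as plain strings here, since parsing the month abbreviation depends on locale and is not needed for a simple count.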
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2038505 - Posted: 17 Mar 2020, 16:01:30 UTC

It is very slow though, like thrashing-hard-drives slow. Connecting takes a lot longer. I might have the benefit of not timing out, given I'm less than 100 miles away.
ID: 2038505
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2038506 - Posted: 17 Mar 2020, 16:04:14 UTC
Last modified: 17 Mar 2020, 16:05:07 UTC

And my connection is 50 Mbps up, 1 Gbps down:

3 173-219-243-100.suddenlink.net (173.219.243.100) 13.628 ms 11.351 ms 11.112 ms
4 173-219-235-186.suddenlink.net (173.219.235.186) 12.516 ms 16.874 ms 15.343 ms
5 173-219-251-43.suddenlink.net (173.219.251.43) 14.232 ms 16.272 ms 13.839 ms
6 64.125.41.222 (64.125.41.222) 69.339 ms 100.264 ms 70.240 ms
7 ae3.cr1.sjc2.us.zip.zayo.com (64.125.29.100) 66.483 ms 68.099 ms 64.884 ms
8 ae27.cs1.sjc2.us.eth.zayo.com (64.125.30.230) 66.153 ms 68.461 ms 68.212 ms
9 ae9.mpr1.pao1.us.zip.zayo.com (64.125.27.189) 65.607 ms 66.359 ms 66.006 ms
10 198.32.251.125 (198.32.251.125) 69.873 ms 78.558 ms 72.171 ms
11 dc-sfo-agg4--oak-agg4-100g.cenic.net (137.164.11.46) 78.276 ms 69.334 ms 69.449 ms
12 * * *
13 reccev-cev-cr1--et-0-0-0.net.berkeley.edu (128.32.0.67) 86.708 ms
sut-mdc-cr1--et-0-0-0.net.berkeley.edu (128.32.0.65) 71.164 ms 69.319 ms
14 e3-48.inr-310-ewdc.berkeley.edu (128.32.0.97) 71.545 ms
et3-47.inr-311-ewdc.berkeley.edu (128.32.0.103) 68.729 ms
et3-48.inr-311-ewdc.berkeley.edu (128.32.0.101) 69.598 ms
15 setifw.berkeley.edu (128.32.16.236) 67.521 ms 69.259 ms 68.076 ms
16 muarae1.ssl.berkeley.edu (208.68.240.110) 71.208 ms 68.922 ms 71.335 ms
ID: 2038506
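To see where the latency in a trace like the one above actually comes from, one option is to take the median round-trip time per hop and look for the largest step between consecutive hops. This is only a sketch under the assumption that the output has the usual traceroute shape shown in the post (hop number, host, then values followed by "ms", with extra hosts for the same hop on unnumbered continuation lines); `hop_medians` and `biggest_jump` are hypothetical helper names.

```python
import re
from statistics import median

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")    # "6 64.125.41.222 (...) 69.339 ms ..."
RTT_RE = re.compile(r"(\d+(?:\.\d+)?) ms")   # every "<number> ms" on a line


def hop_medians(lines):
    """Return {hop number: median RTT in ms}, skipping hops with no replies."""
    rtts = {}
    current = None
    for line in lines:
        match = HOP_RE.match(line)
        if match:
            current = int(match.group(1))
            rest = match.group(2)
        else:
            rest = line  # continuation line: belongs to the previous hop
        if current is None:
            continue
        rtts.setdefault(current, []).extend(float(v) for v in RTT_RE.findall(rest))
    return {hop: median(values) for hop, values in rtts.items() if values}


def biggest_jump(medians):
    """Return (hop, increase in ms) for the largest rise over the previous answering hop."""
    hops = sorted(medians)
    best = None
    for prev, cur in zip(hops, hops[1:]):
        delta = medians[cur] - medians[prev]
        if best is None or delta > best[1]:
            best = (cur, delta)
    return best


trace = [
    "5 173-219-251-43.suddenlink.net (173.219.251.43) 14.232 ms 16.272 ms 13.839 ms",
    "6 64.125.41.222 (64.125.41.222) 69.339 ms 100.264 ms 70.240 ms",
    "12 * * *",
]
print(biggest_jump(hop_medians(trace)))
```

A large jump at one hop that persists to the end of the path points at distance or a slow link there, not at the final server, which fits the roughly constant 65 to 72 ms seen from hop 6 onward in the post.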
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2038512 - Posted: 17 Mar 2020, 16:23:58 UTC

Still getting through though.

17-Mar-2020 09:03:31 [SETI@home] Scheduler request completed: got 0 new tasks
17-Mar-2020 09:14:43 [SETI@home] Scheduler request completed: got 37 new tasks
17-Mar-2020 09:15:04 [SETI@home] Scheduler request completed: got 0 new tasks
17-Mar-2020 09:20:12 [SETI@home] Scheduler request completed: got 33 new tasks
ID: 2038512
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2038513 - Posted: 17 Mar 2020, 16:39:04 UTC

My one Windows box had been having issues, but it had 4 tasks in the download queue which had been stuck retrying. After I processed those, all was well. Neither of my *nix boxes was having issues.
ID: 2038513
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2038517 - Posted: 17 Mar 2020, 17:17:05 UTC
Last modified: 17 Mar 2020, 17:17:19 UTC

The number of results in the database is higher than what killed the servers yesterday, but despite that:

17-Mar-2020 19:14:24 [SETI@home] Scheduler request completed: got 148 new tasks
ID: 2038517
AllgoodGuy

Joined: 29 May 01
Posts: 293
Credit: 16,348,499
RAC: 266
United States
Message 2038520 - Posted: 17 Mar 2020, 17:36:33 UTC - in response to Message 2038517.  

The number of results in the database is higher than what killed the servers yesterday, but despite that:

17-Mar-2020 19:14:24 [SETI@home] Scheduler request completed: got 148 new tasks


Finally. Glad to see you're getting things now.
ID: 2038520
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 2038521 - Posted: 17 Mar 2020, 17:49:47 UTC

No outage again this Tuesday?
ID: 2038521
Profile ravkin
Joined: 14 Aug 09
Posts: 20
Credit: 11,165,042
RAC: 158
United States
Message 2038523 - Posted: 17 Mar 2020, 17:51:21 UTC - in response to Message 2038521.  

No outage again this Tuesday?

Yep, I guess no new tape cassettes will be added and we will chew on the validation backlog for 90 days.
ID: 2038523
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2038530 - Posted: 17 Mar 2020, 18:07:13 UTC - in response to Message 2038521.  

No outage again this Tuesday?


probably not https://news.berkeley.edu/coronavirus/
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2038530
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2038531 - Posted: 17 Mar 2020, 18:08:44 UTC

We should see some improvement on March 23, because that is when my wingmen on thousands of my quorum=1 tasks that validated back at the end of January will either time out or finally report their tasks.

That should, I would hope, reduce the size of the database for the last week of Seti.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2038531
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2038538 - Posted: 17 Mar 2020, 18:20:20 UTC - in response to Message 2038531.  

We should see some improvement on March 23, because that is when my wingmen on thousands of my quorum=1 tasks that validated back at the end of January will either time out or finally report their tasks.

That should, I would hope, reduce the size of the database for the last week of Seti.


or crash it from the resends lol
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2038538
Kevin Olley

Joined: 3 Aug 99
Posts: 906
Credit: 261,085,289
RAC: 572
United Kingdom
Message 2038541 - Posted: 17 Mar 2020, 18:31:53 UTC - in response to Message 2038530.  

No outage again this Tuesday?


probably not https://news.berkeley.edu/coronavirus/


Not good.

It may stop the outage, but if anything goes wrong that cannot be fixed remotely, it could stop everything.
Kevin


ID: 2038541
Ville Saari
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2038550 - Posted: 17 Mar 2020, 19:19:44 UTC - in response to Message 2038538.  
Last modified: 17 Mar 2020, 19:20:22 UTC

We should see some improvement on March 23, because that is when my wingmen on thousands of my quorum=1 tasks that validated back at the end of January will either time out or finally report their tasks.
That should, I would hope, reduce the size of the database for the last week of Seti.
or crash it from the resends lol
Tasks that have reached their quorum won't be resent. Those workunits have already been validated and assimilated and are stuck in 'waiting for db purging' state until the missing result can be purged.

But I'm still wondering how Keith can have thousands of those when I have only two.
ID: 2038550
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2038556 - Posted: 17 Mar 2020, 19:47:18 UTC - in response to Message 2038550.  

We should see some improvement on March 23, because that is when my wingmen on thousands of my quorum=1 tasks that validated back at the end of January will either time out or finally report their tasks.
That should, I would hope, reduce the size of the database for the last week of Seti.
or crash it from the resends lol
Tasks that have reached their quorum won't be resent. Those workunits have already been validated and assimilated and are stuck in 'waiting for db purging' state until the missing result can be purged.

But I'm still wondering how Keith can have thousands of those when I have only two.

Don't know. Just use the offset in the valid tasks URL to look at the last few pages of each host. Most are from around 30 January and have an expiration of 23 March.

OK, exaggerated a bit. Only one host with the last 6 pages being the ones I was speaking of.
https://setiathome.berkeley.edu/results.php?hostid=8030022&offset=17680&show_names=0&state=4&appid=
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2038556
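The offset trick described above can be scripted if you want to jump straight to the last few pages of a host's valid tasks. The sketch below only builds URLs; the query parameters (hostid, offset, show_names, state, appid) are copied from the link in the post. The page size of 20 is an assumption inferred from the offsets posted in this thread, the reading of state=4 as "valid" comes from the post's wording, and both function names are made up for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://setiathome.berkeley.edu/results.php"
PAGE_SIZE = 20  # assumption: offsets seen in this thread are all multiples of 20


def results_url(hostid, offset, state=4):
    """Build a results.php URL with the same parameters as the link in the post."""
    query = urlencode({
        "hostid": hostid,
        "offset": offset,
        "show_names": 0,
        "state": state,   # 4 is taken to mean "valid", as in the post
        "appid": "",
    })
    return f"{BASE_URL}?{query}"


def last_page_offsets(total_tasks, pages, page_size=PAGE_SIZE):
    """Offsets of the last `pages` pages, oldest page first, for a list of `total_tasks` rows."""
    if total_tasks <= 0 or pages <= 0:
        return []
    last_start = ((total_tasks - 1) // page_size) * page_size
    offsets = [last_start - i * page_size for i in range(pages)]
    return sorted(o for o in offsets if o >= 0)


# Example: the final two pages of a hypothetical host with 17,700 valid tasks.
for offset in last_page_offsets(17700, 2):
    print(results_url(8030022, offset))
```

You still need to know roughly how many valid tasks the host has; the total in the example is invented, so adjust it until the last page comes up.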
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2038582 - Posted: 17 Mar 2020, 22:04:14 UTC
Last modified: 17 Mar 2020, 22:43:03 UTC

On my faster hosts I also have pages of them (https://setiathome.berkeley.edu/results.php?hostid=6906726&offset=22280&show_names=0&state=4). They WILL be resent. I've run across a couple of "minimum quorum = 1" WUs on a different machine that have already had a couple resent. The one wingman didn't agree with the one overflow marked valid, so the task was changed to inconclusive and another task was sent to another host. It seems just about everyone has some of these minimum quorum = 1 WUs, some more than others. I'd guess a large chunk of the server problems could be solved by simply running a script that changed all minimum quorum = 1 WUs to completed and validated so they can be removed from the database.
Another one with pages of them: https://setiathome.berkeley.edu/results.php?hostid=6813106&offset=40380&show_names=0&state=4&appid=
ID: 2038582
Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 540
Credit: 65,583,328
RAC: 27
United States
Message 2038591 - Posted: 17 Mar 2020, 22:53:09 UTC

Looking at my older pendings, I found this interesting computer:

https://setiathome.berkeley.edu/show_host_detail.php?hostid=8473422

A 4-core processor with 1 GPU and 15,000 tasks???

His oldest pending was returned 3/12.

He received thousands of tasks today.

No wonder things are not working
Dave

ID: 2038591
Profile Wiggo
Joined: 24 Jan 00
Posts: 34887
Credit: 261,360,520
RAC: 489
Australia
Message 2038593 - Posted: 17 Mar 2020, 22:59:41 UTC - in response to Message 2038591.  

Looking at my older pendings, I found this interesting computer:

https://setiathome.berkeley.edu/show_host_detail.php?hostid=8473422

A 4-core processor with 1 GPU and 15,000 tasks???

His oldest pending was returned 3/12.

He received thousands of tasks today.

No wonder things are not working
That looks like someone who hasn't set their antivirus to ignore their BOINC folders.

Cheers.
ID: 2038593
Dave Stegner
Volunteer tester
Joined: 20 Oct 04
Posts: 540
Credit: 65,583,328
RAC: 27
United States
Message 2038594 - Posted: 17 Mar 2020, 23:02:32 UTC - in response to Message 2038593.  

How did he ever get 15,000+ tasks???
Dave

ID: 2038594


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.