Panic Mode On (113) Server Problems?

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1953816 - Posted: 5 Sep 2018, 8:55:47 UTC - in response to Message 1953778.  

good news... they loaded more data files to be split...
The new blc16 is from the same day (58227) as the data bomb blc11 files. Is that a problem? What does the number after the blc denote?
Not so good news ... I just ran a sample of 19 BLC16 tasks through ... 9 were not noise.

So we have a LOT more of this crap now :(


. . So far the noise bombs I am seeing are still blc11, and they seem to be reducing with the most recent downloads. So far blc16 seem OK on my machines.

Stephen

:)
ID: 1953816 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1953817 - Posted: 5 Sep 2018, 8:58:20 UTC - in response to Message 1953816.  

Yeah, the blc16 noise seems to be better now. More like <30% noise.
Which is good.
ID: 1953817 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1953821 - Posted: 5 Sep 2018, 11:05:03 UTC

Interesting. There was a lack of file deleter backlog after maintenance this time.
The graphs seem to indicate they ran the purge and deleter before they brought the system back up to give the system a running head start on recovery.
ID: 1953821 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 1953964 - Posted: 6 Sep 2018, 1:57:51 UTC

Hmmmm..... the website seems to be a bit laggy. 2-3 seconds to serve a new page.
A proud member of the OFA (Old Farts Association).
ID: 1953964 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1953982 - Posted: 6 Sep 2018, 2:59:02 UTC

There's a daily "glitch" in the responsiveness when some background process hogs all the CPU cycles.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1953982 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13859
Credit: 208,696,464
RAC: 304
Australia
Message 1953999 - Posted: 6 Sep 2018, 4:40:35 UTC

I finally spotted one of the longer-than-usual runtime WUs at the end of its run.
On the CPU, a VLAR WU of that type would generally run for around 1hr 30min. The usual non-VLAR version would run for around 1hr to 1hr 10min. These extra-long non-VLAR WUs are taking 1hr 40min.
Grant
Darwin NT
ID: 1953999 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24920
Credit: 3,081,182
RAC: 7
Ireland
Message 1954836 - Posted: 11 Sep 2018, 18:47:55 UTC

Wow, is this a record?
ID: 1954836 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1954837 - Posted: 11 Sep 2018, 19:00:16 UTC - in response to Message 1954836.  

Wow, is this a record?

In what way?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1954837 · Report as offensive
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1954840 - Posted: 11 Sep 2018, 19:05:53 UTC

Definitely one of the shortest outages I've seen. Very nice. It probably won't even need much recovery time.
ID: 1954840 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1954889 - Posted: 11 Sep 2018, 23:58:54 UTC - in response to Message 1954836.  

Wow, is this a record?


. . I don't know if it's a record but it ranks as one of the shortest outages I have seen ...

Stephen

:)
ID: 1954889 · Report as offensive
Dr Who Fan
Volunteer tester
Joined: 8 Jan 01
Posts: 3350
Credit: 715,342
RAC: 4
United States
Message 1954914 - Posted: 12 Sep 2018, 4:28:42 UTC

Someone needs to fix Beta's stats and get them back in working order: @Beta >Export stats are from July 18th, 2018
ID: 1954914 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1956134 - Posted: 18 Sep 2018, 21:12:50 UTC

. . Well, we have been back since about 8:15pm UTC as best I can tell, another outage of 4 hours or less. Makes me feel a little spoilt. :)

Stephen

:)
ID: 1956134 · Report as offensive
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1956160 - Posted: 18 Sep 2018, 21:47:14 UTC

I agree. It is very nice to be back in the "short outage (under 4 hours) is normal" world. The panic thread hadn't been posted to in 6 days. How amazing!
ID: 1956160 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1956183 - Posted: 18 Sep 2018, 22:18:04 UTC - in response to Message 1956160.  

I agree. It is very nice to be back in the "short outage (under 4 hours) is normal" world. The panic thread hadn't been posted to in 6 days. How amazing!


. . Nice that it has gone from one of the busiest threads to one of the quieter ones.

Stephen

:)
ID: 1956183 · Report as offensive
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1956712 - Posted: 22 Sep 2018, 1:04:39 UTC

I expected them to add more data files to be split to the system before the weekend. They could add more tomorrow, so it isn't time to panic yet, but I thought I'd warm up the thread just in case. We definitely don't have enough to get us to Monday morning.
ID: 1956712 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1956717 - Posted: 22 Sep 2018, 2:19:58 UTC

Nope, we don't have enough. But Eric has been pretty good lately about coming in on holidays and weekends and loading more tapes. Or he has been able to get the colo staff to load more tapes at his direction or whatever.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1956717 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1956758 - Posted: 22 Sep 2018, 7:28:55 UTC
Last modified: 22 Sep 2018, 8:03:30 UTC

As I write, the return rate is around 130,000 an hour. Is this a result of tasks finishing within 3 seconds (overflow results), or are we just seeing a spate of short work units? On my 970, returning 1 at a time, they (blc 06) are completing in about 6 minutes. In my opinion it is nice to be returning work so quickly. I know it will take a lot to process in real time, but it gives me a sense of getting more done compared to returning 97,000 or so an hour.

Edit = fixed spelling error and added blc number
ID: 1956758 · Report as offensive
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1956786 - Posted: 22 Sep 2018, 13:44:19 UTC

Definitely shorter WUs than usual. The credit is running in the 50s for me.
ID: 1956786 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1956795 - Posted: 22 Sep 2018, 14:38:13 UTC - in response to Message 1956786.  
Last modified: 22 Sep 2018, 14:39:27 UTC

Definitely shorter WUs than usual. The credit is running in the 50s for me.


. . That's Credit (N)ew doing its thing. 50's for me as well.

Stephen

:(
ID: 1956795 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51481
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1956800 - Posted: 22 Sep 2018, 15:06:35 UTC

The splitter cache is looking a mite thin.
I'll see if I can get somebody a message to reload it.

Meow.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1956800 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.