Panic Mode On (96) Server Problems?

Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1653933 - Posted: 18 Mar 2015, 0:14:28 UTC

Not a post in this thread in over a day, is this a cause for panic?
ID: 1653933 · Report as offensive
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1653944 - Posted: 18 Mar 2015, 0:57:45 UTC - in response to Message 1653933.  

Not a post in this thread in over a day, is this a cause for panic?

Yes, it appears we have been trained not to post if the AP side of the project is down..
ID: 1653944 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1653994 - Posted: 18 Mar 2015, 5:01:32 UTC

Mmmm... sweet delicious APs..

2015-03-17 21:17:05 SETI@home Scheduler request completed: got 3 new tasks
2015-03-17 21:22:14 SETI@home Scheduler request completed: got 1 new tasks
2015-03-17 21:32:30 SETI@home Scheduler request completed: got 2 new tasks
2015-03-17 21:37:36 SETI@home Scheduler request completed: got 1 new tasks
2015-03-17 21:58:06 SETI@home Scheduler request completed: got 1 new tasks
2015-03-17 22:03:16 SETI@home Scheduler request completed: got 2 new tasks
2015-03-17 22:08:23 SETI@home Scheduler request completed: got 2 new tasks
2015-03-17 22:18:38 SETI@home Scheduler request completed: got 2 new tasks
2015-03-17 22:23:46 SETI@home Scheduler request completed: got 2 new tasks
2015-03-17 22:39:09 SETI@home Scheduler request completed: got 2 new tasks
2015-03-17 22:49:23 SETI@home Scheduler request completed: got 3 new tasks
2015-03-17 22:54:32 SETI@home Scheduler request completed: got 2 new tasks
2015-03-17 23:04:46 SETI@home Scheduler request completed: got 2 new tasks
2015-03-17 23:30:20 SETI@home Scheduler request completed: got 1 new tasks
2015-03-17 23:50:49 SETI@home Scheduler request completed: got 1 new tasks
2015-03-18 00:06:11 SETI@home Scheduler request completed: got 3 new tasks
2015-03-18 00:11:18 SETI@home Scheduler request completed: got 2 new tasks
2015-03-18 00:31:46 SETI@home Scheduler request completed: got 1 new tasks
2015-03-18 00:36:55 SETI@home Scheduler request completed: got 3 new tasks
2015-03-18 00:47:10 SETI@home Scheduler request completed: got 1 new tasks


I still have 1.4M seconds of cache to hoard--err.. fill, but it's getting there.. 1-3 tasks at a time.
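
A log like the one above can be tallied with a few lines of Python. This is only a minimal sketch: the regular expression is written against the exact line format shown in this post, and the function name `summarize` is made up for illustration. Other BOINC client versions may word these lines differently.

```python
import re
from datetime import datetime

# Matches the line format shown in the post above. Assumption: other
# client versions may phrase or timestamp these lines differently.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*"
    r"Scheduler request completed: got (\d+) new tasks"
)

def summarize(lines):
    """Return (request_count, task_total, elapsed_hours) for matching lines."""
    stamps = []
    total = 0
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a "got N new tasks" line
        stamps.append(datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"))
        total += int(m.group(2))
    if not stamps:
        return 0, 0, 0.0
    elapsed = (max(stamps) - min(stamps)).total_seconds() / 3600
    return len(stamps), total, elapsed

# First three lines of the log quoted in this post.
sample = [
    "2015-03-17 21:17:05 SETI@home Scheduler request completed: got 3 new tasks",
    "2015-03-17 21:22:14 SETI@home Scheduler request completed: got 1 new tasks",
    "2015-03-17 21:32:30 SETI@home Scheduler request completed: got 2 new tasks",
]
count, total, hours = summarize(sample)
print(f"{count} requests, {total} tasks over {hours:.2f} h")
```

Feeding it the whole log instead of the three-line sample gives the request count, the task total and the time span in one pass.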
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1653994 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1654022 - Posted: 18 Mar 2015, 6:55:42 UTC - in response to Message 1653994.  

MB splitter output has been exceptionally crappy for the last few hours & is continuing on in that vein.
If there wasn't any AP available, we probably would have run out of MB by now; the output is that poor.
Grant
Darwin NT
ID: 1654022 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1654030 - Posted: 18 Mar 2015, 7:16:47 UTC - in response to Message 1654022.  
Last modified: 18 Mar 2015, 7:22:06 UTC

MB splitter output has been exceptionally crappy for the last few hours & is continuing on in that vein.
If there wasn't any AP available, we probably would have run out of MB by now; the output is that poor.

This could be partly to do with the fact that there are over 2 million results waiting to be assimilated; have a look at the Tech News thread for details. I would like to say thanks to Matt, Jeff and Eric for the fantastic job they are doing keeping everything running.
ID: 1654030 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1654035 - Posted: 18 Mar 2015, 7:43:47 UTC - in response to Message 1654030.  

MB splitter output has been exceptionally crappy for the last few hours & is continuing on in that vein.
If there wasn't any AP available, we probably would have run out of MB by now; the output is that poor.

This could be partly to do with the fact that there are over 2 million results waiting to be assimilated; have a look at the Tech News thread for details. I would like to say thanks to Matt, Jeff and Eric for the fantastic job they are doing keeping everything running.

+1
Also worth noting that when the MB splitters are consistently churning out more than 31/sec it's more a demand issue than poor MB splitting.
ID: 1654035 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1654039 - Posted: 18 Mar 2015, 7:55:55 UTC - in response to Message 1654035.  

MB splitter output has been exceptionally crappy for the last few hours & is continuing on in that vein.
If there wasn't any AP available, we probably would have run out of MB by now; the output is that poor.

This could be partly to do with the fact that there are over 2 million results waiting to be assimilated; have a look at the Tech News thread for details. I would like to say thanks to Matt, Jeff and Eric for the fantastic job they are doing keeping everything running.

+1
Also worth noting that when the MB splitters are consistently churning out more than 31/sec it's more a demand issue than poor MB splitting.

Last SSP update, a bit more than 28/sec. But good to note that the assimilators knocked 10k off of the waiting to be assimilated queue since the prior update.

And hey, the kitties are getting APs!!!
Happy kitties...........

Meow!
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1654039 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1654042 - Posted: 18 Mar 2015, 8:10:13 UTC

These APs are tasty. "Number of tasks today: 78" and my 10-day cache is almost filled. Probably another 5-10 more and that should do it. Om nom nom nom..
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1654042 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1654044 - Posted: 18 Mar 2015, 8:14:18 UTC - in response to Message 1654039.  

MB splitter output has been exceptionally crappy for the last few hours & is continuing on in that vein.
If there wasn't any AP available, we probably would have run out of MB by now; the output is that poor.

This could be partly to do with the fact that there are over 2 million results waiting to be assimilated; have a look at the Tech News thread for details. I would like to say thanks to Matt, Jeff and Eric for the fantastic job they are doing keeping everything running.

+1
Also worth noting that when the MB splitters are consistently churning out more than 31/sec it's more a demand issue than poor MB splitting.

Last SSP update, a bit more than 28/sec. But good to note that the assimilators knocked 10k off of the waiting to be assimilated queue since the prior update.

And hey, the kitties are getting APs!!!
Happy kitties...........

Meow!

Meow indeed.
I do still wonder about the wisdom of having half the splitters and download servers, all the assimilators, the db purge and all the transitioners living on the same physical box. Seems like it's certain that the whole system will be slow to recover from any backlogs, like we tend to see.
ID: 1654044 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1654048 - Posted: 18 Mar 2015, 8:25:20 UTC - in response to Message 1654044.  
Last modified: 18 Mar 2015, 8:26:21 UTC

MB splitter output has been exceptionally crappy for the last few hours & is continuing on in that vein.
If there wasn't any AP available, we probably would have run out of MB by now; the output is that poor.

This could be partly to do with the fact that there are over 2 million results waiting to be assimilated; have a look at the Tech News thread for details. I would like to say thanks to Matt, Jeff and Eric for the fantastic job they are doing keeping everything running.

+1
Also worth noting that when the MB splitters are consistently churning out more than 31/sec it's more a demand issue than poor MB splitting.

Last SSP update, a bit more than 28/sec. But good to note that the assimilators knocked 10k off of the waiting to be assimilated queue since the prior update.

And hey, the kitties are getting APs!!!
Happy kitties...........

Meow!

Meow indeed.
I do still wonder about the wisdom of having half the splitters and download servers, all the assimilators, the db purge and all the transitioners living on the same physical box. Seems like it's certain that the whole system will be slow to recover from any backlogs, like we tend to see.

Well, Matt has been in the lab pretty regularly now, and I know he has been keeping tabs on things much better than has been possible in a while. He has commented about moving certain processes from here to there.
I have faith that he knows what boxen can handle what workload. And if things go astray, he'll be on it.
If anybody can keep the Seti servers rolling, Matt is the one that can do it.
He also mentioned that fresh AP work was due to roll out soon.
What could be better than that, eh?

I am very happy that Matt has the time to be in the lab keeping tabs on things for a while....he happens to be Seti's best server guru. Eric is no slouch, of course, but I sense that Matt is the master of such thingys.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1654048 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1654134 - Posted: 18 Mar 2015, 14:43:29 UTC

The ap_11no14aa tape file does have the old problem with corrupted data on the _B3_P1_ channel which is seen as needing 100% blanking by Astropulse apps. Because that problem tends to last for a few weeks or months, I suspect all the other no14 tape files will show it too.
                                                                   Joe
ID: 1654134 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1654143 - Posted: 18 Mar 2015, 15:03:15 UTC

Has anyone noticed the GPU APs are being sent first?
I've noticed. I notice things such as that.
I even had to lower my cache settings to receive CPU APs, quite the opposite of the way it has been for the last 6 months or so.
Now if I can just get a few APs to last more than 5 or 10 minutes I might be able to build a cache...

;-)
ID: 1654143 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1654160 - Posted: 18 Mar 2015, 16:15:56 UTC - in response to Message 1654143.  

I did notice that I only got GPU WUs at first when they came back, but eventually got CPU WUs.
ID: 1654160 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1654191 - Posted: 18 Mar 2015, 17:42:05 UTC - in response to Message 1654134.  

The ap_11no14aa tape file does have the old problem with corrupted data on the _B3_P1_ channel which is seen as needing 100% blanking by Astropulse apps. Because that problem tends to last for a few weeks or months, I suspect all the other no14 tape files will show it too.
                                                                   Joe

Yeah, I got rid of a handful of those that I received last night. I offline-tested them and, when they came back 100% blanked, suspended all the tasks in my cache and resumed just those ones to get rid of them.

ap_05no14aa_B3_P1_00018_20150317_22015.wu_1
ap_10no14ab_B3_P1_00057_20150317_02539.wu_1
ap_10no14ab_B3_P1_00304_20150317_02539.wu_1
ap_11no14aa_B3_P1_00072_20150318_20542.wu_1
ap_11no14aa_B3_P1_00395_20150318_20542.wu_0
ap_12no14ab_B3_P1_00020_20150318_32121.wu_1
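
Spotting these in a task list by eye gets tedious, so here is a rough sketch of a filter. The name layout in the regular expression is inferred only from the task names in this thread (a tape name followed by a `B#_P#` channel), and `suspect_tasks` is a hypothetical helper, not part of BOINC or any Astropulse tool. The "no14" tag follows Joe's suspicion that the other no14 tape files are affected too.

```python
import re

# Assumed layout, inferred only from the names posted in this thread:
#   ap_<tape>_<B#>_<P#>_<digits>_<digits>_<digits>.wu_<n>
NAME_RE = re.compile(r"^ap_(?P<tape>[0-9a-z]+)_(?P<channel>B\d+_P\d+)_")

def suspect_tasks(names, bad_channel="B3_P1", tape_tag="no14"):
    """Return the names on bad_channel whose tape name contains tape_tag."""
    hits = []
    for name in names:
        m = NAME_RE.match(name)
        if m and m.group("channel") == bad_channel and tape_tag in m.group("tape"):
            hits.append(name)
    return hits

names = [
    "ap_11no14aa_B3_P1_00072_20150318_20542.wu_1",  # from the list above
    "ap_12no14ab_B3_P1_00020_20150318_32121.wu_1",  # from the list above
    "ap_11no14aa_B2_P0_00001_20150318_20542.wu_0",  # hypothetical name, other channel
]
for name in suspect_tasks(names):
    print(name)
```

A match only means the task is worth checking. Whether it really comes back 100% blanked still has to be confirmed by running it.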
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1654191 · Report as offensive
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1654285 - Posted: 18 Mar 2015, 21:30:03 UTC - in response to Message 1654048.  
Last modified: 18 Mar 2015, 21:30:48 UTC

I am very happy that Matt has the time to be in the lab keeping tabs on things for a while....he happens to be Seti's best server guru. Eric is no slouch, of course, but I sense that Matt is the master of such thingys.


I appreciate the comment and know you mean well :), but Eric and Jeff are smarter than I am - I just have more time than them to focus and work on these problems when I'm around.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1654285 · Report as offensive
Dad
Volunteer tester

Joined: 21 May 99
Posts: 44
Credit: 35,266,844
RAC: 10
United States
Message 1654294 - Posted: 18 Mar 2015, 22:05:21 UTC - in response to Message 1654285.  

Such modesty
ID: 1654294 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1654297 - Posted: 18 Mar 2015, 22:08:31 UTC - in response to Message 1654285.  

I am very happy that Matt has the time to be in the lab keeping tabs on things for a while....he happens to be Seti's best server guru. Eric is no slouch, of course, but I sense that Matt is the master of such thingys.


I appreciate the comment and know you mean well :), but Eric and Jeff are smarter than I am - I just have more time than them to focus and work on these problems when I'm around.

- Matt


Thanks anyways Matt.

I'm fully aware it's lack of manpower that's causing all this.


With each crime and every kindness we birth our future.
ID: 1654297 · Report as offensive
Darth Beaver Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 20 Aug 99
Posts: 6728
Credit: 21,443,075
RAC: 3
Australia
Message 1654307 - Posted: 18 Mar 2015, 22:48:04 UTC

Anybody know if the server page is updating? It looks like the AP database has packed it in and gone on another holiday.
ID: 1654307 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Send message
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1654313 - Posted: 18 Mar 2015, 23:09:17 UTC - in response to Message 1654307.  

The SSP shows "[As of 18 Mar 2015, 23:00:03 UTC]", marvin is shown running, and there are only 5 AP WUs awaiting assimilation. I see no hint that there's anything wrong in that.
                                                                   Joe
ID: 1654313 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1654330 - Posted: 19 Mar 2015, 0:07:53 UTC - in response to Message 1654285.  
Last modified: 19 Mar 2015, 0:08:16 UTC


I appreciate the comment and know you mean well :), but Eric and Jeff are smarter than I am - I just have more time than them to focus and work on these problems when I'm around.

- Matt


MATT, don't say that. You know the REAL person in a chain is the one that can actually get work done. Sure, some may be great coders (and know little science), some may be great system administrators, and maybe some know the science to amazing limits and (oh hell) can't stop that 12:00 from flashing on their VCR/DVD :)

Point is... you may not be better than everyone, but you get work done! I bet you can NOT name one person that is the best in the project. Sure, he does this better and she does that better, but in the end it is a team effort.
ID: 1654330 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.