Panic Mode On (84) Server Problems?

Profile Wiggo
Joined: 24 Jan 00
Posts: 36791
Credit: 261,360,520
RAC: 489
Australia
Message 1384327 - Posted: 24 Jun 2013, 19:28:16 UTC - in response to Message 1384258.  

I wonder if the creation rate could be related to file 20jn12ac as it has been in its current state for way longer than is usual.


That file is still in SSP.


I sent a message to Eric asking about it, but have not received a response as of yet.

Looks like they re-kicked that file as it's back down to 13 from 14 but whether that will help it....

Cheers.
ID: 1384327 · Report as offensive
Profile Fred E.
Volunteer tester

Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1384411 - Posted: 25 Jun 2013, 0:35:36 UTC

Anyone else having problems getting work? Plenty of Multibeam work shows on the Server Status Page, but I've only got 1 plus 4 Astropulse tasks in the last 14 work requests. The others resulted in "no tasks available". I wonder if the feeder is working right.
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1384411 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 36791
Credit: 261,360,520
RAC: 489
Australia
Message 1384412 - Posted: 25 Jun 2013, 0:38:08 UTC - in response to Message 1384411.  

Anyone else having problems getting work? Plenty of Multibeam work shows on the Server Status Page, but I've only got 1 plus 4 Astropulse tasks in the last 14 work requests. The others resulted in "no tasks available". I wonder if the feeder is working right.

There may be another VLAR storm happening.

Cheers.
ID: 1384412 · Report as offensive
Cruncher-American · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1384419 - Posted: 25 Jun 2013, 1:55:22 UTC

OK, so I'm not getting any work either.

What's broken now?

According to the server page, everything is OK.

Do you believe that? I don't. So at the very least, the server status is broken.

Bah!
ID: 1384419 · Report as offensive
Profile Gary Charpentier · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 31009
Credit: 53,134,872
RAC: 32
United States
Message 1384425 - Posted: 25 Jun 2013, 2:45:35 UTC - in response to Message 1384419.  

OK, so I'm not getting any work either.

What's broken now?

According to the server page, everything is OK.

Do you believe that? I don't. So at the very least, the server status is broken.

Bah!

2 of your computers are getting work, just one isn't. Look to it to make sure it is asking for work.


ID: 1384425 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 36791
Credit: 261,360,520
RAC: 489
Australia
Message 1384451 - Posted: 25 Jun 2013, 5:06:07 UTC - in response to Message 1384425.  

Definitely something wrong somewhere; my 660s are already doing backup projects and my 550Tis are about to do the same.

A feeder issue maybe?

Cheers.
ID: 1384451 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1384455 - Posted: 25 Jun 2013, 5:14:55 UTC - in response to Message 1384451.  
Last modified: 25 Jun 2013, 5:16:02 UTC

To add to the difficulty in getting work, there won't be any to get anyway: the splitters have stopped splitting again,
even though the server status page shows them running and tonnes of data ready to be split.
Grant
Darwin NT
ID: 1384455 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1384460 - Posted: 25 Jun 2013, 5:28:02 UTC - in response to Message 1384457.  

Now I will get gastric problems, including flatulence, belching, abdominal bloating and pain, headache, pain, indigestion and dizziness.

I should hope so.

Grant
Darwin NT
ID: 1384460 · Report as offensive
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1384485 - Posted: 25 Jun 2013, 6:53:10 UTC - in response to Message 1384460.  

Now I will get gastric problems, including flatulence, belching, abdominal bloating and pain, headache, pain, indigestion and dizziness.

I should hope so.

With all of that going on, I'm just glad he's in Sweden and not at my house.
ID: 1384485 · Report as offensive
kittyman · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51478
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1384491 - Posted: 25 Jun 2013, 7:04:19 UTC - in response to Message 1384327.  
Last modified: 25 Jun 2013, 7:08:06 UTC

I wonder if the creation rate could be related to file 20jn12ac as it has been in its current state for way longer than is usual.


That file is still in SSP.


I sent a message to Eric asking about it, but have not received a response as of yet.

Looks like they re-kicked that file as it's back down to 13 from 14 but whether that will help it....

Cheers.

I did get a response from Eric this afternoon whilst I was at work. He was having some time off with dear Angela this weekend.

He said that when a dataset (he used the word 'tape') gets stuck like that, it DOES tie up a splitter, thereby reducing potential work production by 1/6th, as there are 6 enabled MB splitters at present.

I emailed him back, politely asking why, then, the offending dataset was still languishing in the queue... LOL.

If what you are telling me is correct, it looks like he DID give it a kick, but it must have gotten stuck again.
I suspect when he gets my message tomorrow, he'll take another look at it and either kick it again or eject it.
Either way, it's on his radar now, and I am sure he'll keep an eye on it.

The moral of the story is, kitties......
If anybody notices a dataset 'stuck' like that for more than a couple of days, when everything else around it has been split and sent out, please feel free to PM me or at least bring it up in the forums. It DOES tie up a splitter when that happens. I'll try to get the message to Eric.

Meow.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1384491 · Report as offensive
Profile Mike · Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34380
Credit: 79,922,639
RAC: 80
Germany
Message 1384497 - Posted: 25 Jun 2013, 7:22:18 UTC

My GPU is rarely getting work either.
I want my VLARs back.



With each crime and every kindness we birth our future.
ID: 1384497 · Report as offensive
kittyman · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51478
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1384499 - Posted: 25 Jun 2013, 7:26:21 UTC
Last modified: 25 Jun 2013, 7:28:55 UTC

I am wondering what the sustained inbound traffic to the servers as seen on the Cricket graphs for the last 10-11 hours is all about.....

It is unlike any 'normal' comms from the hosts, and it is also unlike any previous bursts of uploading new data from the lab.

DOS attack??

The kitties are curious.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1384499 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1384502 - Posted: 25 Jun 2013, 7:38:39 UTC - in response to Message 1384499.  

I am wondering what the sustained inbound traffic to the servers as seen on the Cricket graphs for the last 10-11 hours is all about.....

It is unlike any 'normal' comms from the hosts, and it is also unlike any previous bursts of uploading new data from the lab.

DOS attack??

The kitties are curious.


I was thinking it might be the usual data from the archive traffic, but limited by whatever is causing issues with the Scheduler/splitters/feeders.

There's plenty of work there, but it's just not being allocated. AP assimilators are backing up again. Ready-to-send buffer actually spiked much higher than usual before it topped out.
So something's gummed up the works.
Grant
Darwin NT
ID: 1384502 · Report as offensive
kittyman · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51478
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1384504 - Posted: 25 Jun 2013, 7:44:09 UTC - in response to Message 1384502.  

I am wondering what the sustained inbound traffic to the servers as seen on the Cricket graphs for the last 10-11 hours is all about.....

It is unlike any 'normal' comms from the hosts, and it is also unlike any previous bursts of uploading new data from the lab.

DOS attack??

The kitties are curious.


I was thinking it might be the usual data from the archive traffic, but limited by whatever is causing issues with the Scheduler/splitters/feeders.

There's plenty of work there, but it's just not being allocated. AP assimilators are backing up again. Ready-to-send buffer actually spiked much higher than usual before it topped out.
So something's gummed up the works.

I just dunno....
Whatever is going on, it is not usual.

When you are on the Cricket graphs, click the 'long term' link.
There has not been any activity similar to this since the move to the colo.


"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1384504 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1384506 - Posted: 25 Jun 2013, 7:46:07 UTC - in response to Message 1384504.  

When you are on the Cricket graphs, click the 'long term' link.
There has not been any activity similar to this since the move to the colo.

Nope.
But then we haven't had Scheduler/feeder issues either since the move. Till now.
Grant
Darwin NT
ID: 1384506 · Report as offensive
Cruncher-American · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1384536 - Posted: 25 Jun 2013, 11:52:41 UTC - in response to Message 1384425.  

2 of your computers are getting work, just one isn't. Look to it to make sure it is asking for work.



The one not getting work is no longer online. It was GPU only. I recently acquired an i7-3820, a motherboard, and 4x2GB of quad-channel 2133MHz RAM (thanks, Craigslist!), and have replaced Unimatrix002 with I7-3820. I am now running 8 HT cores, with 4 CPU tasks and the other 4 reserved for graphics support (love those AP 6.04s!).

When I can get them, that is.

Right now, neither is getting work except for a (very) few APs overnight. I7-3820 is about to run out of work; Fermibox2 still has a bunch to do.
ID: 1384536 · Report as offensive
JohnDK · Crowdfunding Project Donor* · Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1384542 - Posted: 25 Jun 2013, 12:26:43 UTC

Re. lack of AP work: could it be that more and more people are switching to AP only (like me), because of the credits and because there's now a stock Linux GPU AP app?
ID: 1384542 · Report as offensive
Cruncher-American · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1384546 - Posted: 25 Jun 2013, 13:20:12 UTC - in response to Message 1384542.  

Re. lack of AP work. Could it be that more and more switch to doing AP only (like me), due to the credits + there's now a stock linux GPU AP app?

Maybe, but I note that on the server status page, the number of channels to do for MB hasn't changed since at least last night, and very few (if any) of AP have been processed either.

Definitely seems to be a splitter problem, at least at first glance.
ID: 1384546 · Report as offensive
Cruncher-American · Crowdfunding Project Donor* · Special Project $75 donor · Special Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1384552 - Posted: 25 Jun 2013, 14:09:12 UTC

UPDATE: Since my last post, SSP says that 3 channels of AP and 3 of MB have been processed. Not very much.
ID: 1384552 · Report as offensive
Profile Mike · Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34380
Credit: 79,922,639
RAC: 80
Germany
Message 1384560 - Posted: 25 Jun 2013, 14:56:44 UTC - in response to Message 1384546.  
Last modified: 25 Jun 2013, 14:56:58 UTC

Re. lack of AP work. Could it be that more and more switch to doing AP only (like me), due to the credits + there's now a stock linux GPU AP app?

Maybe, but I note that on the server status page, the number of channels to do for MB hasn't changed since at least last night, and very few (if any) of AP have been processed either.

Definitely seems to be a splitter problem, at least at first glance.


I think it's more of a feeder problem, since over 300,000 V7 are still ready to send.
I might be wrong on that.


With each crime and every kindness we birth our future.
ID: 1384560 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.