Panic Mode On (83) Server Problems?

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1362909 - Posted: 30 Apr 2013, 18:34:28 UTC - in response to Message 1362860.  

+ 1, the splitters our next bottleneck?

It would appear so.
Just coming back from the outage, there is only 1 splitter running, and it is only producing 6 WUs/sec. With shorties we need at least 55/s, so we really need 10 PFB splitters. There are only 6, and it's been a while since they've all been running. No wonder we keep running out of work, even while it is being split.
Grant
Darwin NT
ID: 1362909 · Report as offensive
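The splitter count in the post above follows from simple division. A minimal sketch in Python, assuming every splitter sustains the 6 WUs/sec seen right after the outage and that output scales linearly with the number of splitters (an assumption for illustration, not something the server status page confirms). The helper name `splitters_needed` is hypothetical.

```python
import math

def splitters_needed(demand_per_sec: float, rate_per_splitter: float) -> int:
    """Smallest whole number of splitters whose combined output meets demand."""
    if rate_per_splitter <= 0:
        raise ValueError("rate_per_splitter must be positive")
    return math.ceil(demand_per_sec / rate_per_splitter)

# Figures quoted in the post: 55 WUs/sec demand during a shorty storm,
# 6 WUs/sec from the single splitter that was running.
needed = splitters_needed(55, 6)
print(needed)   # 10

# Combined output of the 6 existing splitters under the same linear assumption.
print(6 * 6)    # 36
```

Under those assumptions, six splitters would top out at 36 WUs/sec, well short of the 55/s quoted for a shorty storm.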
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1362925 - Posted: 30 Apr 2013, 19:38:09 UTC - in response to Message 1362909.  

+ 1, the splitters our next bottleneck?

It would appear so.
Just coming back from the outage, there is only 1 splitter running, and it is only producing 6 WUs/sec. With shorties we need at least 55/s, so we really need 10 PFB splitters. There are only 6, and it's been a while since they've all been running. No wonder we keep running out of work, even while it is being split.


All the splitters are now running, and they're producing less than half of what is needed.
Grant
Darwin NT
ID: 1362925 · Report as offensive
andybutt
Volunteer tester
Joined: 18 Mar 03
Posts: 262
Credit: 164,205,187
RAC: 516
United Kingdom
Message 1362958 - Posted: 30 Apr 2013, 21:14:15 UTC - in response to Message 1362925.  

Hungry GPUs sitting here doing nothing! Ho hum!
ID: 1362958 · Report as offensive
Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1362966 - Posted: 30 Apr 2013, 21:32:34 UTC

Just got a "Project down for maintenance" and a one hour backoff, so something's underway.

Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1362966 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24877
Credit: 3,081,182
RAC: 7
Ireland
Message 1362967 - Posted: 30 Apr 2013, 21:33:17 UTC

Hmmn, I get this....

30/04/2013 22:30:55 | SETI@home | Requesting new tasks for CPU
30/04/2013 22:30:57 | SETI@home | Scheduler request completed: got 0 new tasks
30/04/2013 22:30:57 | SETI@home | Project is temporarily shut down for maintenance

ID: 1362967 · Report as offensive
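For anyone wanting to spot these replies automatically, here is a minimal Python sketch that splits event-log lines like the ones above into timestamp, project, and message. The `dd/mm/yyyy hh:mm:ss | project | message` layout is taken from this excerpt only; other clients or locales may format lines differently. `LogEntry` and `parse_line` are hypothetical names, not part of BOINC.

```python
from datetime import datetime
from typing import NamedTuple

class LogEntry(NamedTuple):
    when: datetime
    project: str
    message: str

def parse_line(line: str) -> LogEntry:
    """Split one 'timestamp | project | message' line into its three fields."""
    # maxsplit=2 keeps any later '|' characters inside the message field.
    stamp, project, message = (part.strip() for part in line.split("|", 2))
    return LogEntry(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message)

lines = [
    "30/04/2013 22:30:55 | SETI@home | Requesting new tasks for CPU",
    "30/04/2013 22:30:57 | SETI@home | Scheduler request completed: got 0 new tasks",
    "30/04/2013 22:30:57 | SETI@home | Project is temporarily shut down for maintenance",
]
entries = [parse_line(line) for line in lines]

# Flag the maintenance reply so a script could back off instead of retrying.
in_maintenance = any("maintenance" in entry.message for entry in entries)
print(in_maintenance)
```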
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 1362977 - Posted: 30 Apr 2013, 21:42:17 UTC - in response to Message 1362967.  

ID: 1362977 · Report as offensive
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1362986 - Posted: 30 Apr 2013, 22:35:48 UTC

My main cruncher just downloaded 98 GPU tasks, but my second one still can't get any.
ID: 1362986 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 1362987 - Posted: 30 Apr 2013, 22:38:07 UTC

ID: 1362987 · Report as offensive
Sakletare
Joined: 18 May 99
Posts: 132
Credit: 23,423,829
RAC: 0
Sweden
Message 1362994 - Posted: 30 Apr 2013, 22:42:51 UTC

And what is this 90 Mbits/sec upload that's running 24/7, even when the project is down for maintenance?

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-211/gigabitethernet6_17&ranges=d%3Aw&view=Octets
ID: 1362994 · Report as offensive
ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1362998 - Posted: 30 Apr 2013, 22:45:54 UTC - in response to Message 1362994.  

And what is this 90 Mbits/sec upload that's running 24/7, even when the project is down for maintenance?

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-211/gigabitethernet6_17&ranges=d%3Aw&view=Octets

I don't think anyone ever guaranteed that that link was exclusively for seti@home.
ID: 1362998 · Report as offensive
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1363013 - Posted: 30 Apr 2013, 23:08:57 UTC - in response to Message 1362998.  

The MB splitters are certainly having a hard time getting up to their usual speed.

Cheers.
ID: 1363013 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1363019 - Posted: 30 Apr 2013, 23:38:30 UTC - in response to Message 1362998.  

And what is this 90 Mbits/sec upload that's running 24/7, even when the project is down for maintenance?

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-211/gigabitethernet6_17&ranges=d%3Aw&view=Octets

I don't think anyone ever guaranteed that that link was exclusively for seti@home.

By the nature of a gigabit router interface, that link must be for SETI@home exclusively. My guess is the backup operation which is the reason for the Tuesday outage is sensibly being done to someplace outside the colocation facility.
Joe
ID: 1363019 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 1363032 - Posted: 1 May 2013, 0:11:28 UTC

Still no work, but I thought the outage was over?
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1363032 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 1363059 - Posted: 1 May 2013, 1:46:33 UTC - in response to Message 1362998.  

And what is this 90 Mbits/sec upload that's running 24/7, even when the project is down for maintenance?

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-211/gigabitethernet6_17&ranges=d%3Aw&view=Octets

I don't think anyone ever guaranteed that that link was exclusively for seti@home.

No, I'm still seeing no work; the server says no work is available...
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1363059 · Report as offensive
Tom*
Joined: 12 Aug 11
Posts: 127
Credit: 20,769,223
RAC: 9
United States
Message 1363060 - Posted: 1 May 2013, 1:46:36 UTC
Last modified: 1 May 2013, 1:56:32 UTC

Never thought I'd say this -

Please split some APs to slow down the feeder, link and client systems.
ID: 1363060 · Report as offensive
RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1363061 - Posted: 1 May 2013, 2:53:07 UTC - in response to Message 1362998.  

And what is this 90 Mbits/sec upload that's running 24/7, even when the project is down for maintenance?

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-211/gigabitethernet6_17&ranges=d%3Aw&view=Octets

I don't think anyone ever guaranteed that that link was exclusively for seti@home.


Then let's program the upload and download servers to report their data rates and show them on the status page.
ID: 1363061 · Report as offensive
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1363062 - Posted: 1 May 2013, 3:05:59 UTC
Last modified: 1 May 2013, 3:12:09 UTC

What screams at me looking at the Munin graphs is that only when we ran out of AstroPulse units did the MB units start to plummet from 300K to 0. This suggests to me that a lot of GPU AstroPulse is being done and when that ran out the MB reserve was quickly devoured by hungry hungry GPUs.

It's interesting that 30 units/s creation rate isn't enough, at least during a shorty storm.
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1363062 · Report as offensive
ExchangeMan
Volunteer tester
Joined: 9 Jan 00
Posts: 115
Credit: 157,719,104
RAC: 0
United States
Message 1363073 - Posted: 1 May 2013, 3:30:20 UTC - in response to Message 1363062.  

What screams at me looking at the Munin graphs is that only when we ran out of AstroPulse units did the MB units start to plummet from 300K to 0. This suggests to me that a lot of GPU AstroPulse is being done and when that ran out the MB reserve was quickly devoured by hungry hungry GPUs.

It's interesting that 30 units/s creation rate isn't enough, at least during a shorty storm.

I noticed this too before the shutdown. When MB and AP units were both being split, things kind of behaved themselves and I got all the MB I could crunch. I sure hope this is fixed tomorrow.

The current server status page shows 6 MB splitters active. 8 were active earlier and couldn't keep up. This is very frustrating for dedicated Seti crunchers like me.

ID: 1363073 · Report as offensive
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1363106 - Posted: 1 May 2013, 6:51:32 UTC - in response to Message 1363062.  
Last modified: 1 May 2013, 7:00:02 UTC

It's interesting that 30 units/s creation rate isn't enough, at least during a shorty storm.

During a shorty storm 55/s is the minimum needed to meet demand.
The storm is over, but still the splitters aren't able to crank out enough work. For a while they were doing about 30/s (barely enough when there's a lot of VLARs in the mix). Now they've dropped down to less than 20.

Someone in the lab needs to take a look at what is going on: the splitters used to be able to sustain 70/s with no problems at all, and now they can't even reach it as a peak.


EDIT: at least the shorty storm was over for a while. The work my systems got after the outage while I was at work didn't have many shorties in it, but one of the systems was just able to get some more work (still nowhere near enough...) and it was almost all shorties.
Grant
Darwin NT
ID: 1363106 · Report as offensive
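To put those rates in perspective, here is a rough Python sketch of how long a ready-to-send buffer lasts when demand exceeds splitter output. It assumes constant rates and borrows the roughly 300K MB reserve mentioned earlier in the thread; both are simplifications for illustration, not measurements. The function name `hours_until_empty` is hypothetical.

```python
def hours_until_empty(buffer_wus: float, demand_per_sec: float, supply_per_sec: float) -> float:
    """Hours until the buffer drains; infinity if supply keeps up with demand."""
    deficit = demand_per_sec - supply_per_sec
    if deficit <= 0:
        return float("inf")
    return buffer_wus / deficit / 3600

# Shorty-storm demand of 55/s against the two splitter rates quoted above.
for supply in (30, 20):
    print(supply, round(hours_until_empty(300_000, 55, supply), 2))
```

With these inputs the buffer lasts a few hours at most, which fits the quick drop from 300K to 0 described earlier in the thread.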
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1363109 - Posted: 1 May 2013, 7:02:26 UTC - in response to Message 1363106.  
Last modified: 1 May 2013, 7:02:49 UTC

It's interesting that 30 units/s creation rate isn't enough, at least during a shorty storm.

During a shorty storm 55/s is the minimum needed to meet demand.
The storm is over, but still the splitters aren't able to crank out enough work. For a while they were doing about 30/s (barely enough when there's a lot of VLARs in the mix). Now they've dropped down to less than 20.

Someone in the lab needs to take a look at what is going on: the splitters used to be able to sustain 70/s with no problems at all, and now they can't even reach it as a peak.

Dunno...for 10-20 minutes, they were down to 10-15/second. You can see the resulting dip in the cricket graphs. I was kinda hoping they would add another one of them fancy new pfb splitters during today's outage, seeing as how the AP splitters have been far outrunning them as of late.
Last check, they were getting close to 30 again.

Might be something with the new pfb splitting thingy, although I think it was Richard who said he thought they might be a little faster than the old splitters... hmm.
Might depend on the work they are splitting.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1363109 · Report as offensive


 